8.06 Spring 2016 Lecture Notes 


1. Approximate methods for time-independent Hamiltonians 


Aram Harrow 


Last updated: February 17, 2016 


Contents

1 Time-independent perturbation theory
   1.1 Non-degenerate case
   1.2 Degenerate perturbation theory

2 The Hydrogen spectrum
   2.1 Fine structure
   2.2 Hyperfine splitting
   2.3 Zeeman effect

3 WKB
   3.1 Introduction to WKB
   3.2 Validity of WKB
   3.3 Bohr-Sommerfeld quantization
   3.4 Connection formulae
   3.5 Tunneling


Often we can’t solve the Schrödinger equation exactly. This is in fact almost always the case.
For example, consider the van der Waals force between two Hydrogen atoms. What happens then? 
Do we give up? 


Of course not! We use approximate methods. The guiding philosophy is 
• Reduce the real system to a toy model that can be exactly solved.

• “Solve” the actual Schrödinger equation in some (hopefully controlled) approximation.


Our strategy will depend on the sense in which our real system is close to our toy system. If the difference is small
(meaning not very much energy) we use perturbation theory. If we are dealing with a Hamiltonian 
changing slowly in time we use the adiabatic approximation and for a Hamiltonian varying slowly 
in space we use WKB. We can also consider perturbations that are localized in space, which leads 
to the framework of scattering. 


1 Time-independent perturbation theory 


1.1 Non-degenerate case 
1.1.1 Setup 


Suppose
$$ H = H_0 + \delta H, $$
where $H_0$ has a known eigenspectrum
$$ H_0 |n^0\rangle = E_n^0 |n^0\rangle, \qquad n = 0, 1, 2, \ldots, $$
and $\delta H$ is a “small” perturbation. We can make this precise by saying that $\|\delta H\| = O(\lambda)$, where $\lambda$ is a small dimensionless number (say around 0.01). (Here the norm of a Hermitian matrix is defined to be the largest absolute value of its eigenvalues.) Some examples are:


• Relativistic effects: $\lambda \sim v/c$.
• Spin-orbit coupling: $\lambda \sim \alpha \approx 1/137$.
• Weak $E$ or $B$ field.


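We will solve for $|n\rangle$ and $E_n$ order by order in $\lambda$; i.e.
$$ |n\rangle = |n^0\rangle + |n^1\rangle + |n^2\rangle + \cdots, \qquad E_n = E_n^0 + E_n^1 + E_n^2 + \cdots. $$
The definition of $|n^k\rangle$ is the $O(\lambda^k)$ piece of $|n\rangle$. Formally $|n^k\rangle = \frac{\lambda^k}{k!}\frac{\partial^k}{\partial\lambda^k}|n\rangle\big|_{\lambda=0}$ and $E_n^k = \frac{\lambda^k}{k!}\frac{\partial^k}{\partial\lambda^k}E_n\big|_{\lambda=0}$.
While we know the energies and eigenstates of $H_0$, we may not know so much about $\delta H$. However, we will need to assume that at least we know its matrix elements in the unperturbed eigenbasis. Denote these by
$$ \delta H_{mn} = \langle m^0|\delta H|n^0\rangle. $$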


Today we will consider the non-degenerate case; i.e. when $E_m^0 \neq E_n^0$ for $m \neq n$. Next time we will consider the degenerate case. Your antennae should be going up at this: the difference between $E_m^0 \neq E_n^0$ and $E_m^0 = E_n^0$ can be arbitrarily small, so how can they really lead to different physical theories? In fact, we will see that for non-degenerate perturbation theory to make sense, the energy levels need to be not only different, but also far enough apart, in a sense that we will make precise later.


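Difference from Griffiths. Write $|n\rangle = |n^0\rangle + |\delta n\rangle$. We will work in a convention where $\langle n^0|\delta n\rangle = 0$. Equivalently $\langle n^0|n^k\rangle = 0$ for all $k > 0$. Since $|n^0\rangle$ is normalized, this means that
$$ \langle n|n\rangle = \langle n^0|n^0\rangle + \langle\delta n|\delta n\rangle = 1 + \langle\delta n|\delta n\rangle \geq 1. $$
Of course $|n\rangle$ is still a valid eigenvector even if it is not a unit vector. But if we want a unit vector, we will need to take
$$ |n\rangle_{\text{norm}} = \frac{|n^0\rangle + |\delta n\rangle}{\sqrt{1 + \langle\delta n|\delta n\rangle}}. $$
This convention is used in Sakurai. As a result of this convention
\begin{align}
\langle n^0|n\rangle &= 1 \tag{1}\\
\langle m^0|n\rangle &= \langle m^0|\delta n\rangle \quad \text{for } m \neq n \tag{2}
\end{align}
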
1.1.2 Perturbative solutions 
We want to solve the eigenvalue equation
$$ H|n\rangle = E_n|n\rangle. \tag{3} $$
Instead of expanding every term, we will make choices that will be justified in hindsight:
$$ (H_0 + \delta H)|n\rangle = E_n\left(|n^0\rangle + |\delta n\rangle\right). \tag{4} $$
Thus we avoid for now expanding $|n\rangle$ on the LHS and $E_n$ on the RHS. Now left-multiply (4) by $\langle n^0|$ and use $\langle n^0|H_0 = \langle n^0|E_n^0$ and $\langle n^0|n\rangle = 1$ to obtain
$$ \langle n^0|\delta H|n\rangle = E_n - E_n^0. \tag{5} $$
What if we instead left-multiply by $\langle m^0|$ for some $m \neq n$? Then we obtain (using $\langle m^0|n^0\rangle = 0$)
$$ \langle m^0|\delta H|n\rangle = (E_n - E_m^0)\langle m^0|\delta n\rangle. \tag{6} $$
So far everything is still exact, but further progress will require approximation. We now solve (5) and (6) order by order in $\lambda$.
Replacing the $|n\rangle$ in (5) with $|n^0\rangle + O(\lambda)$ we obtain
$$ \boxed{E_n = E_n^0 + \langle n^0|\delta H|n^0\rangle + O(\lambda^2)}. \tag{7} $$


This is the first-order energy shift. It will soon become an old friend. 
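Performing the same substitution in a rearranged version of (6) yields
$$ \langle m^0|\delta n\rangle = \frac{\langle m^0|\delta H|n\rangle}{E_n - E_m^0} = \frac{\langle m^0|\delta H|n^0\rangle + O(\lambda^2)}{E_n^0 - E_m^0 + O(\lambda)} = \frac{\langle m^0|\delta H|n^0\rangle}{E_n^0 - E_m^0} + O(\lambda^2). \tag{8} $$
Repeating for all values of $m \neq n$ (recall that $\langle n^0|\delta n\rangle = 0$ by fiat) we obtain
$$ |\delta n\rangle = \sum_{m \neq n} |m^0\rangle \frac{\langle m^0|\delta H|n^0\rangle}{E_n^0 - E_m^0} + O(\lambda^2). \tag{9} $$
This yields the first-order shift in the wavefunction. We need to be a little careful here. Clearly we are using the non-degenerate condition here by assuming that $E_n^0 - E_m^0 \neq 0$ for $n \neq m$. But we have actually used a robust version of this assumption: dropping the $O(\lambda)$ term in the denominator of (8) requires $|E_n^0 - E_m^0|$ to be large compared with $\lambda$.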

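On to second order!
\begin{align}
E_n &= E_n^0 + \langle n^0|\delta H|n\rangle && \text{exact} \tag{10a}\\
&= E_n^0 + \langle n^0|\delta H|n^0\rangle + \langle n^0|\delta H|\delta n\rangle && \text{still exact} \tag{10b}\\
&= E_n^0 + \delta H_{nn} + \langle n^0|\delta H \sum_{m \neq n} |m^0\rangle \frac{\delta H_{mn}}{E_n^0 - E_m^0} + O(\lambda^3) && \text{using (9)} \tag{10c}\\
&= E_n^0 + \delta H_{nn} + \sum_{m \neq n} \frac{|\delta H_{mn}|^2}{E_n^0 - E_m^0} + O(\lambda^3) \tag{10d}
\end{align}
We will generally not need the second-order shift in the wavefunction, but it can be computed to be
$$ |n^2\rangle = \sum_{k \neq n}\sum_{l \neq n} |k^0\rangle \frac{\delta H_{kl}\,\delta H_{ln}}{(E_n^0 - E_k^0)(E_n^0 - E_l^0)} - \sum_{k \neq n} |k^0\rangle \frac{\delta H_{nn}\,\delta H_{kn}}{(E_n^0 - E_k^0)^2}. \tag{11} $$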


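For higher-order corrections, Sakurai is the best reference. For 8.06, we will never go beyond second order in energy or first order in wavefunction, although below we will see that (11) is relevant to degenerate perturbation theory.

To summarize, we have
\begin{align}
E_n^0 &= E_n^0 \\
E_n^1 &= \delta H_{nn} \\
E_n^2 &= \sum_{m \neq n} \frac{|\delta H_{mn}|^2}{E_n^0 - E_m^0} \\
|n^0\rangle &= |n^0\rangle \\
|n^1\rangle &= \sum_{m \neq n} |m^0\rangle \frac{\langle m^0|\delta H|n^0\rangle}{E_n^0 - E_m^0}
\end{align}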


What about normalization? We should multiply by $(1 + \langle\delta n|\delta n\rangle)^{-1/2} = 1 - \frac{1}{2}\langle n^1|n^1\rangle + O(\lambda^3)$ so this affects only $|n^2\rangle$ and not $|n^1\rangle$.
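
The summary formulas are easy to check numerically. The following is a minimal sketch, assuming a diagonal $H_0$ with distinct entries and a small real symmetric $\delta H$; the three-level numbers and the function names are made up for illustration and do not come from the lecture. It evaluates $E_n^1$, $E_n^2$ and the coefficients of $|n^1\rangle$, then measures how badly the truncated energy and state fail the eigenvalue equation.

```python
# Minimal sketch of non-degenerate perturbation theory for a real symmetric dH.
# Assumptions: H0 is diagonal with distinct entries E0[n]; dH is a small
# real symmetric matrix given as a list of lists. All names are illustrative.

def perturbation_theory(E0, dH, n):
    """Return (E1, E2, n1): energy shifts and first-order state coefficients."""
    dim = len(E0)
    E1 = dH[n][n]                                    # E_n^1 = dH_nn
    E2 = sum(dH[m][n] ** 2 / (E0[n] - E0[m])         # E_n^2 = sum |dH_mn|^2 / (E_n^0 - E_m^0)
             for m in range(dim) if m != n)
    n1 = [0.0 if m == n else dH[m][n] / (E0[n] - E0[m])
          for m in range(dim)]                       # coefficients <m^0|n^1>
    return E1, E2, n1


def residual_norm(E0, dH, n):
    """Norm of (H - E)|n> with E and |n> truncated as in the summary."""
    dim = len(E0)
    E1, E2, n1 = perturbation_theory(E0, dH, n)
    energy = E0[n] + E1 + E2
    state = [(1.0 if m == n else 0.0) + n1[m] for m in range(dim)]
    H = [[dH[i][j] + (E0[i] if i == j else 0.0) for j in range(dim)]
         for i in range(dim)]
    res = [sum(H[i][j] * state[j] for j in range(dim)) - energy * state[i]
           for i in range(dim)]
    return sum(r * r for r in res) ** 0.5


lam = 0.01
E0 = [0.0, 1.0, 3.0]
dH = [[lam * x for x in row] for row in
      [[1.0, 2.0, 0.5], [2.0, -1.0, 1.0], [0.5, 1.0, 0.0]]]

E1, E2, n1 = perturbation_theory(E0, dH, 0)
print(E1, E2, n1)
print(residual_norm(E0, dH, 0))   # small: second order in lam
```

The residual should shrink roughly like $\lambda^2$ as `lam` is reduced, which is what we expect when the state is kept only to first order.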


1.1.3 Energy shifts of the ground state 


The first-order energy shift can of course be either positive or negative; e.g. suppose $\delta H = \pm\lambda I$. But there is one thing we can always say about the resulting first-order estimate: it never understates the true ground-state energy of the perturbed system. Here is the proof. The first-order estimate of the ground-state energy is
$$ E_0^0 + E_0^1 = \langle 0^0|H_0|0^0\rangle + \langle 0^0|\delta H|0^0\rangle = \langle 0^0|H|0^0\rangle \geq \langle 0|H|0\rangle = E_0. $$
The second equality is from the identity $H = H_0 + \delta H$ and the inequality is the variational principle: $\langle\psi|H|\psi\rangle \geq \langle 0|H|0\rangle$ for all unit vectors $|\psi\rangle$.

In this case, we would hope that the second-order term $E_0^2$ would improve things by being negative. And this is indeed the case.
$$ E_0^2 = \sum_{m \neq 0} \frac{|\delta H_{m0}|^2}{E_0^0 - E_m^0}. $$
Every term in the sum is $\leq 0$ so we always have $E_0^2 \leq 0$.
More generally at 2nd order we observe “level repulsion.” The n’th energy level is pushed up by levels with $m < n$ and pushed down by levels with $m > n$ (assuming that $E_0^0 < E_1^0 < \ldots$).


1.1.4 Range of validity and a two-state example 


As we go to higher orders of perturbation theory, we multiply by entries of $\delta H$ (e.g. $\delta H_{mn}$) and divide by differences of eigenvalues of $H_0$, e.g. $E_n^0 - E_m^0$. So the perturbation has to be small with respect to the level spacing. See diagram on blackboard plotting $E(\lambda)$ as a function of $\lambda$.


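Here’s probably the simplest possible example.
$$ H = \begin{pmatrix} E_0^0 & \lambda \\ \lambda & E_1^0 \end{pmatrix} = \underbrace{\begin{pmatrix} E_0^0 & 0 \\ 0 & E_1^0 \end{pmatrix}}_{H_0} + \underbrace{\begin{pmatrix} 0 & \lambda \\ \lambda & 0 \end{pmatrix}}_{\delta H} $$
Let’s write $E_0^0, E_1^0 = \bar E \pm \Delta$ so that $H_0 = \bar E + \Delta\sigma_z$. The first-order shifts are $\delta H_{00} = \delta H_{11} = 0$. The second-order shifts are
$$ E_0^2 = \frac{|\delta H_{01}|^2}{E_0^0 - E_1^0} = \frac{\lambda^2}{2\Delta}, \qquad E_1^2 = \frac{|\delta H_{01}|^2}{E_1^0 - E_0^0} = -\frac{\lambda^2}{2\Delta}. $$
Of course this problem can be solved directly more easily, as you will explore on your pset. There you will find that the energy levels are (exactly)
$$ E_0, E_1 = \bar E \pm \sqrt{\Delta^2 + \lambda^2}. $$
In the $|\lambda| \ll |\Delta|$ limit this can be written as
$$ \bar E \pm |\Delta|\sqrt{1 + \frac{\lambda^2}{\Delta^2}} \approx \bar E \pm \left(|\Delta| + \frac{\lambda^2}{2|\Delta|}\right), $$
while in the $|\lambda| \gg |\Delta|$ limit we can write this as
$$ \bar E \pm |\lambda|\sqrt{1 + \frac{\Delta^2}{\lambda^2}} \approx \bar E \pm \left(|\lambda| + \frac{\Delta^2}{2|\lambda|}\right). $$
The splitting in energy levels is $2\Delta$ for $\lambda = 0$, then has an $O(\lambda^2)$ term for small $\lambda$, and becomes approximately linear in $\lambda$ for large $\lambda$. This type of behavior is called an “avoided crossing” because of the fact that generically Hamiltonians tend to have non-degenerate eigenvalues. See figure drawn in lecture.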
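
To see the range of validity concretely, here is a small sketch comparing the exact two-level energies with the second-order estimate. It assumes $\Delta > 0$ so that $E_0^0 = \bar E + \Delta$ is the upper level; the function names and the sample values of $\lambda$ are illustrative only.

```python
import math

def exact_levels(e_bar, delta, lam):
    """Exact eigenvalues e_bar +/- sqrt(delta^2 + lam^2) of the 2x2 Hamiltonian."""
    root = math.sqrt(delta ** 2 + lam ** 2)
    return e_bar + root, e_bar - root

def second_order_levels(e_bar, delta, lam):
    """Unperturbed levels plus the second-order shifts +/- lam^2 / (2 delta)."""
    shift = lam ** 2 / (2 * delta)
    return e_bar + delta + shift, e_bar - delta - shift

e_bar, delta = 0.0, 1.0
for lam in (0.01, 0.1, 0.5, 2.0):
    exact = exact_levels(e_bar, delta, lam)
    approx = second_order_levels(e_bar, delta, lam)
    # The error grows quickly once lam is comparable to the level spacing 2*delta.
    print(lam, exact[0], approx[0], abs(exact[0] - approx[0]))
```

For $\lambda \ll \Delta$ the two columns agree up to an $O(\lambda^4/\Delta^3)$ error, while for $\lambda$ comparable to or larger than $\Delta$ the perturbative estimate is badly off.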


1.1.5 Anharmonic oscillator 


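$$ H = \underbrace{\frac{p^2}{2m} + \frac{m\omega^2 x^2}{2}}_{H_0} + \underbrace{\lambda x^4}_{\delta H} \tag{12} $$

How does the ground-state energy change? The unperturbed ground-state energy is $E_0^0 = \frac{1}{2}\hbar\omega$. Using $\hat x = \sqrt{\frac{\hbar}{2m\omega}}(a + a^\dagger)$ we can calculate
$$ E_0^1 = \lambda\langle 0|\hat x^4|0\rangle = \lambda\frac{\hbar^2}{4m^2\omega^2}\langle 0|(a + a^\dagger)^4|0\rangle = \frac{3\lambda\hbar^2}{4m^2\omega^2}. $$

Here is an alternate derivation using integrals. This one is hairier, so let’s set $\hbar = m = \omega = 1$. We will use $\langle x|0\rangle = N e^{-x^2/2}$, for some normalization $N$. Then we can compute
$$ E_0^1 = \lambda\langle 0|x^4|0\rangle = \lambda\frac{\int_{-\infty}^{\infty} dx\, e^{-x^2} x^4}{\int_{-\infty}^{\infty} dx\, e^{-x^2}}. $$
Here our job becomes easier if we introduce a parameter.
\begin{align}
\int dx\, e^{-ax^2} &= \sqrt{\frac{\pi}{a}} \\
\int dx\, x^2 e^{-ax^2} &= -\frac{d}{da}\int dx\, e^{-ax^2} = \frac{1}{2}\frac{\sqrt{\pi}}{a^{3/2}} \\
\int dx\, x^4 e^{-ax^2} &= \frac{d^2}{da^2}\int dx\, e^{-ax^2} = \left(\frac{1}{2}\right)\left(\frac{3}{2}\right)\frac{\sqrt{\pi}}{a^{5/2}}
\end{align}
Thus $\langle x^4\rangle = 3/4$.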


Moral of the story: Gaussian integrals involve some beautiful tricks that you should learn. But 
raising and lowering operators are easier. 
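
As a quick check of the value $\langle x^4\rangle = 3/4$, the sketch below builds the matrix of $\hat x = (a + a^\dagger)/\sqrt{2}$ in a truncated number basis (with $\hbar = m = \omega = 1$) and reads off the ground-state element of $\hat x^4$. The truncation size 8 is an arbitrary choice; any size of at least 3 gives the same element, because four steps of $\hat x$ starting from $|0\rangle$ never go above level 2 on a path that returns to $|0\rangle$.

```python
import math

def x_matrix(dim):
    """Position operator in the number basis, with hbar = m = omega = 1."""
    x = [[0.0] * dim for _ in range(dim)]
    for n in range(dim - 1):
        # <n|x|n+1> = <n+1|x|n> = sqrt((n+1)/2)
        x[n][n + 1] = x[n + 1][n] = math.sqrt((n + 1) / 2.0)
    return x

def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

x = x_matrix(8)
x2 = matmul(x, x)
x4 = matmul(x2, x2)
x4_ground = x4[0][0]          # <0|x^4|0>

# Same quantity from the Gaussian integrals at a = 1.
x4_gaussian = 0.5 * 1.5 * math.sqrt(math.pi) / math.sqrt(math.pi)
print(x4_ground, x4_gaussian)
```

Multiplying by $\lambda$ gives the first-order shift of the ground-state energy in these units.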


1.2 Degenerate perturbation theory 
1.2.1 Overview 


The first-order corrections to the wavefunction and the second-order corrections to the energy all have factors of $E_n^0 - E_m^0$ in the denominator. So when two energy levels become equal, these give nonsense answers. But in fact, even the first-order energy shift will be wrong in this case. Let us revisit the case of two-level systems. Suppose that $E_0^0 = E_1^0$. For simplicity, assume $E_0^0 = E_1^0 = 0$ so the overall Hamiltonian is
$$ H = \delta H = \begin{pmatrix} 0 & \lambda \\ \lambda & 0 \end{pmatrix}. $$
The eigenvalues are $\pm\lambda$, which is first order in $\delta H$. But the diagonal elements of $\delta H$ are zero, so (7) would say that the first-order energy shifts are zero. To summarize, the first-order energy shift is wrong, and the second-order energy shift and first-order wavefunction shift are infinite. The situation looks grim.

There is one point in the above paragraph where I pulled a fast one. The reference to “diagonal elements” refers to the eigenbasis of $H_0$. In the above example, this is labeled $|0^0\rangle$ and $|1^0\rangle$, which we took to be $|\uparrow\rangle$ and $|\downarrow\rangle$ respectively. But since $H_0$ has degenerate eigenvalues, the corresponding eigenvectors are not unique. We can make a unitary change of basis within the degenerate eigenspace and obtain new eigenvectors. In that example, we could take
$$ |0^0\rangle = \frac{|\uparrow\rangle + |\downarrow\rangle}{\sqrt 2} \quad\text{and}\quad |1^0\rangle = \frac{|\uparrow\rangle - |\downarrow\rangle}{\sqrt 2} $$
so that the matrix for $\delta H$ would be diagonal in the $\{|0^0\rangle, |1^0\rangle\}$ basis. In this case the first-order energy shift is exactly correct and the second-order energy shift is zero.


General rule for degeneracy in $H_0$. For a general Hamiltonian where $H_0$ has degenerate eigenvalues, the strategy is:

1. Choose an eigenbasis $\{|n^0\rangle\}$ for $H_0$ where $\delta H_{mn} = 0$ for each $m \neq n$ with $E_m^0 = E_n^0$. This is called a “good basis.”

2. Apply non-degenerate perturbation theory to handle the remaining off-diagonal terms in $\delta H$ (which now will only be nonzero when $E_m^0 \neq E_n^0$).


Why is this possible? By the spectral theorem, we can write $H_0$ as
$$ H_0 = \sum_i E_i^0 \Pi_i, $$
where the $E_i^0$ are the distinct eigenvalues of $H_0$ and $\Pi_i$ is a projection operator supported on the space of $E_i^0$-eigenvectors. Call this space $V_i$. We will think of $H_0$ as block diagonal with blocks corresponding to the subspaces $V_1, V_2, \ldots$.

We can write $\delta H = \sum_{i,j} \Pi_i\,\delta H\,\Pi_j$. The diagonal blocks are the components of the form $\Pi_i\,\delta H\,\Pi_i$. These are the blocks we would like to diagonalize. Since each $\Pi_i\,\delta H\,\Pi_i$ is a Hermitian matrix we can write
$$ \Pi_i\,\delta H\,\Pi_i = \sum_{a=1}^{\dim V_i} \Delta_{i,a}\,|i,a\rangle\langle i,a| \tag{14} $$
where $\Delta_{i,a}$ are the eigenvalues of $\Pi_i\,\delta H\,\Pi_i$ (when considered an operator on $V_i$) and $\{|i,a\rangle\}_{a=1,\ldots,\dim V_i}$ forms an orthonormal basis for $V_i$. Since each $|i,a\rangle \in V_i$, we also have $H_0|i,a\rangle = E_i^0|i,a\rangle$. Now the first-order energy shifts are given by the $\Delta_{i,a}$, and the second-order energy shifts and first-order wavefunction corrections involve off-diagonal elements of $\delta H$ between eigenvectors of $H_0$ with different eigenvalues. If this fully breaks the degeneracy of $H_0$, then we can follow the lines of the non-degenerate case. We first introduce a little more notation: let $n$ stand for the pair $(i,a)$, denote $|n\rangle^{\text{rotated}} = |i,a\rangle$ and let $\delta H^{\text{rotated}}$ denote $\delta H$ in the $\{|n\rangle^{\text{rotated}}\}$ basis.
$$ E_n = E_n^0 + \delta H_{nn}^{\text{rotated}} + \sum_{m : E_m^0 \neq E_n^0} \frac{|\delta H_{mn}^{\text{rotated}}|^2}{E_n^0 - E_m^0} + O(\lambda^3) \tag{15} $$


The above equations work in many cases, including all the examples on problem sets or exams that you will encounter. However, it may be that the diagonal elements of $\delta H$ do not fully break the degeneracy. In this case, higher-order corrections may encounter new degeneracies which may require new changes of basis.¹ In the next section we will describe systematically what to do in this case.

¹ Here is an example. Take $\lambda \ll \Delta$ and let [example matrices not recoverable]


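Graphically, $H_0$ might look something like
$$ H_0 = \begin{pmatrix} E_1^0 I & 0 & 0 \\ 0 & E_2^0 I & 0 \\ 0 & 0 & E_3^0 I \end{pmatrix}, $$
where each entry is a block and each $I$ is an identity matrix of size $\dim V_i$. We can write $\delta H$ in this basis and generically it could have every matrix element nonzero, e.g.
$$ \delta H = \begin{pmatrix} * & * & * \\ * & * & * \\ * & * & * \end{pmatrix}. \tag{17} $$
By choosing a new basis for each block, $H_0$ stays the same and $\delta H$ takes on the form
$$ \delta H^{\text{rotated}} = \begin{pmatrix} D_1 & * & * \\ * & D_2 & * \\ * & * & D_3 \end{pmatrix}, \tag{18} $$
where each $D_i$ is a diagonal matrix and each $*$ stands for a generic block. There are still off-diagonal terms but only between different eigenvalues of $H_0$.

Why is it reasonable to ask that we find a new basis in this way? After all, the whole point of perturbation theory was that it was too hard to diagonalize $H_0 + \delta H$. However, now we only need to diagonalize $\delta H$ in each block of degenerate eigenvalues. Typically these will be of much lower dimension than the overall space.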
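
The effect of choosing a good basis can be seen in the two-level example with $H_0 = 0$ and $\delta H = \lambda\sigma_x$. The sketch below, with illustrative names and an arbitrary value of $\lambda$, computes the matrix of $\delta H$ in the naive basis $\{|\uparrow\rangle, |\downarrow\rangle\}$ and in the rotated basis $(|\uparrow\rangle \pm |\downarrow\rangle)/\sqrt 2$, and compares the diagonal entries, which are the first-order energy shifts.

```python
import math

def change_basis(op, basis):
    """Matrix elements <b_i|op|b_j> for real orthonormal basis vectors b_i."""
    dim = len(op)

    def apply(vec):
        return [sum(op[i][j] * vec[j] for j in range(dim)) for i in range(dim)]

    return [[sum(bi[k] * v for k, v in enumerate(apply(bj))) for bj in basis]
            for bi in basis]

lam = 0.05
dH = [[0.0, lam], [lam, 0.0]]                 # H0 = 0 is fully degenerate

naive_basis = [[1.0, 0.0], [0.0, 1.0]]        # |up>, |down>
s = 1 / math.sqrt(2)
good_basis = [[s, s], [s, -s]]                # (|up> +/- |down>) / sqrt(2)

naive = change_basis(dH, naive_basis)
rotated = change_basis(dH, good_basis)
print([naive[i][i] for i in range(2)])        # first-order shifts in the naive basis
print([rotated[i][i] for i in range(2)])      # first-order shifts in the good basis
print(rotated[0][1])                          # off-diagonal element in the good basis
```

In the naive basis both first-order shifts vanish, while in the good basis they reproduce the exact eigenvalues $\pm\lambda$ and the off-diagonal element is zero.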


1.2.2 First-order wavefunction correction - degenerate case 


Note: This section is included for completeness, but contains material that goes beyond lecture and 
will not appear on any problem sets or exams. 

What about the perturbed wavefunction $|n\rangle$? The zeroth order wavefunction should be $|n^0\rangle^{\text{rotated}}$. It is tempting to state that the first-order correction to $|n\rangle = |i,a\rangle$ is
$$ |n^1\rangle \overset{⚠}{=} \sum_{m : E_m^0 \neq E_n^0} |m^0\rangle^{\text{rotated}} \frac{\delta H_{mn}^{\text{rotated}}}{E_n^0 - E_m^0} = \sum_{j \neq i}\sum_{b=1}^{\dim V_j} |j,b\rangle^{\text{rotated}} \frac{\delta H_{jb,ia}^{\text{rotated}}}{E_i^0 - E_j^0} \tag{19} $$
but this is only part of the story (the ⚠ is a warning that this equation is not quite correct). (19) does indeed describe the contribution to $|i,a\rangle$ from states $|j,b\rangle^{\text{rotated}}$ with $i \neq j$, but there are also contributions within the same block, i.e. from states with $j = i$. This can be thought of as representing the need to further rotate our rotated basis to account for degeneracies that arise at higher order in perturbation theory.

Another way to think about that is that if we go to second order in perturbation theory, we get the contribution (following (11)):
$$ |n^2\rangle = \sum_{(j,b) \neq (i,a)}\;\sum_{(k,c) \neq (i,a)} |j,b\rangle^{\text{rotated}} \frac{\delta H_{jb,kc}^{\text{rotated}}\,\delta H_{kc,ia}^{\text{rotated}}}{(E_{i,a} - E_{j,b})(E_{i,a} - E_{k,c})} - \sum_{(j,b) \neq (i,a)} |j,b\rangle^{\text{rotated}} \frac{\delta H_{ia,ia}^{\text{rotated}}\,\delta H_{jb,ia}^{\text{rotated}}}{(E_{i,a} - E_{j,b})^2}. \tag{20} $$
Here $E_{i,a}$ denotes the energy of $|i,a\rangle$ including its first-order shift. Is this really a second-order (i.e. $O(\lambda^2)$) correction? First look at the second term. Because we have rotated into a block-diagonal basis, $\delta H_{jb,ia}$ is zero unless $i \neq j$. Thus the denominator is $O(1)$ and the numerator is $O(\lambda^2)$, and the second term gives an $O(\lambda^2)$ correction. What about the first term? Now the block-diagonal constraint means that only the $i \neq k \neq j$ terms survive, and again the numerator is $O(\lambda^2)$. However, it is legal to have $i = j$ (as long as $a \neq b$). In this case, the energy splitting between $(i,a)$ and $(i,b)$ is $O(\lambda)$. Thus, this term contributes $O(\lambda^2/\lambda) = O(\lambda)$. In other words, it is a first-order contribution, despite appearing in the expansions of the second-order term. (For similar reasons, (20) gives only part of the true second-order contribution, for which we need to go to third order.) We conclude that the true first-order correction to the wavefunction is


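$$ |(i,a)^1\rangle = \sum_{j \neq i}\sum_{b=1}^{\dim V_j} |j,b\rangle^{\text{rotated}} \frac{\delta H_{jb,ia}^{\text{rotated}}}{E_i^0 - E_j^0} + \sum_{b \neq a} \frac{|i,b\rangle^{\text{rotated}}}{\Delta_{i,a} - \Delta_{i,b}} \sum_{k \neq i}\sum_{c=1}^{\dim V_k} \frac{\delta H_{ib,kc}^{\text{rotated}}\,\delta H_{kc,ia}^{\text{rotated}}}{E_i^0 - E_k^0}, \tag{21} $$
where $\Delta_{i,a}$ denotes the first-order energy shift of $|i,a\rangle$, i.e. the corresponding eigenvalue in (14).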


1.2.3 Hydrogen preview 
An important application of degenerate perturbation theory is to the spectrum of hydrogen. Here
$$ H_0 = \frac{p^2}{2m_e} - \frac{e^2}{r}. $$
The eigenbasis can be taken to be $|n,l,m_l,m_s\rangle$ where $n$ is the principal quantum number, $l$ denotes total orbital angular momentum, $m_l$ its z-component and $m_s$ the z-component of the electron spin. Another valid basis is $|n,l,j,m_j\rangle$ where $j$ denotes the total overall angular momentum (i.e. corresponding to the operator $\vec J^2$ where $\vec J = \vec L + \vec S$) and $m_j$ its z component.

In both cases, the eigenvalues of $H_0$ depend only on $n$ and all the other degrees of freedom are degenerate. Indeed
$$ E^0_{n,l,m_l,m_s} = -\frac{1}{n^2}\frac{m_e e^4}{2\hbar^2} = -\frac{13.6\,\text{eV}}{n^2}. $$
Next week we will discuss a number of corrections to this that are smaller by factors of either $\alpha = \frac{e^2}{\hbar c} \approx \frac{1}{137}$ (the fine structure constant) or $\frac{m_e}{m_p} \approx \frac{1}{1800}$. (Indeed even $\frac{m_e e^4}{\hbar^2}$ can be written as $m_e c^2\alpha^2$, where $m_e c^2$ is the energy scale set by the rest mass of the electron. And this of course is smaller by a factor of roughly $\frac{m_e}{m_p}$ than the mass of the entire atom. We ignore these larger energies in what follows because they are not relevant to experiments in which the electron or proton are not created or destroyed.) The contributions to the energy of hydrogen are summarized as follows.

what                  where discussed   magnitude
0th order             8.04/8.05         $E_n^0 \sim m_e e^4/\hbar^2 \sim m_e c^2\alpha^2$
fine structure        8.05              $E_{\text{fs}} \sim m_e c^2\alpha^4$
Lamb shift            QFT               $E_{\text{Lamb}} \sim m_e c^2\alpha^5$
hyperfine structure   8.05              $E_{\text{hf}} \sim m_e c^2\alpha^4\,(m_e/m_p)$
proton radius         pset              $\sim m_e c^2\alpha^2\,(r_{\text{proton}}/a_0)^2$
Zeeman effect         Griffiths         depends on B field
Stark effect          pset              depends on E field

Table 1: Contributions to the hydrogen energy levels.

We mention here also the “spectroscopic notation” convention, which is used for the coupled basis. The state $|n,l,j,m_j\rangle$ is written as $nL_j$ where “L” is a letter that expresses the orbital angular momentum according to the rule:

l = 0, 1, 2, 3, ...   corresponds to   L = S, P, D, F, ...

Some examples are

state        n   l   j
$1S_{1/2}$   1   0   1/2
$2S_{1/2}$   2   0   1/2
$2P_{1/2}$   2   1   1/2
$2P_{3/2}$   2   1   3/2

1.2.4 Two-spin example 


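Let’s see how these ideas work with a simple example. Consider two spin-1/2 particles with Hamiltonian
$$ H_0 = \frac{E_0}{\hbar^2}\,\vec S_1\cdot\vec S_2 = \frac{E_0}{\hbar^2}\left(S_x\otimes S_x + S_y\otimes S_y + S_z\otimes S_z\right). $$
To diagonalize this, define $\vec J = \vec S_1 + \vec S_2$ and observe that $\vec J^2 = \vec S_1^2 + \vec S_2^2 + 2\vec S_1\cdot\vec S_2$. Thus
$$ H_0 = \frac{E_0}{\hbar^2}\,\frac{\vec J^2 - \vec S_1^2 - \vec S_2^2}{2} = E_0\left(\frac{\vec J^2}{2\hbar^2} - \frac{3}{4}\right). $$
The eigenvalues of $\vec J^2$ are 0 (degeneracy 1) and $2\hbar^2$ (degeneracy 3); these correspond to total spin 0 and 1 respectively. Thus the spectrum of $H_0$ is $-\frac{3}{4}E_0$ (with degeneracy 1) and $\frac{1}{4}E_0$ (with degeneracy 3).
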


Case 1. Now suppose we add a perturbation
$$ \delta H = \frac{\lambda}{\hbar}\left(S_{z,1} + S_{z,2}\right) = \frac{\lambda}{\hbar}J_z. $$
(This might arise from applying a magnetic field in the $\hat z$ direction.) We need to choose a good basis. Fortunately, $\delta H$ and $H_0$ commute and the coupled basis $|j,m\rangle$ (with $j = 0,1$ and $-j \leq m \leq j$) works. In this basis we have
$$ H_0|j,m\rangle = E_0\left(\frac{j(j+1)}{2} - \frac{3}{4}\right)|j,m\rangle $$
$$ \delta H|j,m\rangle = \lambda m|j,m\rangle $$
[Draw energy level diagram of this.]

This was too easy! When everything commutes, this is what it looks like. Of course, we could have chosen a more foolish eigenbasis of $H_0$. Any eigenbasis would have to include the singlet $|0,0\rangle = \frac{|+\rangle\otimes|-\rangle - |-\rangle\otimes|+\rangle}{\sqrt 2}$ but it could be completed with any additional three orthonormal states in the triplet space. If we chose these to be anything other than $|1,1\rangle, |1,0\rangle, |1,-1\rangle$, then even first-order perturbation theory would give the wrong answer. (Working this out is an exercise left to the reader.)


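Case 2. Let us try a slightly more interesting perturbation.
$$ \delta H = \frac{\lambda}{\hbar}\left(S_{z,1} - S_{z,2}\right). $$
This could arise from applying a magnetic field to positronium. We calculate its matrix elements in the coupled basis as follows:
\begin{align}
\delta H|1,1\rangle &= \delta H\,|+\rangle\otimes|+\rangle = 0 \\
\delta H|1,-1\rangle &= \delta H\,|-\rangle\otimes|-\rangle = 0 \\
\delta H|1,0\rangle &= \delta H\,\frac{|+\rangle\otimes|-\rangle + |-\rangle\otimes|+\rangle}{\sqrt 2} = \lambda\,\frac{|+\rangle\otimes|-\rangle - |-\rangle\otimes|+\rangle}{\sqrt 2} = \lambda|0,0\rangle
\end{align}
Since $\delta H$ is Hermitian we know also that $\delta H|0,0\rangle = \lambda|1,0\rangle$.

Thus, in the coupled basis, ordered as $|1,1\rangle, |1,0\rangle, |1,-1\rangle, |0,0\rangle$, we have
$$ H_0 = \begin{pmatrix} \frac{E_0}{4} & 0 & 0 & 0 \\ 0 & \frac{E_0}{4} & 0 & 0 \\ 0 & 0 & \frac{E_0}{4} & 0 \\ 0 & 0 & 0 & -\frac{3}{4}E_0 \end{pmatrix}, \qquad \delta H = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \lambda \\ 0 & 0 & 0 & 0 \\ 0 & \lambda & 0 & 0 \end{pmatrix}. \tag{22} $$
First-order perturbation theory (correctly) gives us zero energy shift to first order in $\lambda$. The second-order shifts are
\begin{align}
E^2_{1,0} &= \frac{|\delta H_{(0,0),(1,0)}|^2}{E^0_{1,0} - E^0_{0,0}} = \frac{\lambda^2}{E_0} \\
E^2_{0,0} &= \frac{|\delta H_{(0,0),(1,0)}|^2}{E^0_{0,0} - E^0_{1,0}} = -\frac{\lambda^2}{E_0}
\end{align}
where we have used the fact that $E_0 = E^0_{1,0} - E^0_{0,0}$.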
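
The matrix elements in Case 2 can be checked by brute force in the product basis. The following sketch assumes the normalization $\delta H = (\lambda/\hbar)(S_{z,1} - S_{z,2})$ with $\hbar = 1$, orders the product basis as $|{+}{+}\rangle, |{+}{-}\rangle, |{-}{+}\rangle, |{-}{-}\rangle$, and uses illustrative values for $\lambda$ and $E_0$. It also compares the second-order estimate for the $|1,0\rangle$ level with the exact eigenvalue of the $2\times 2$ block spanned by $|1,0\rangle$ and $|0,0\rangle$.

```python
import math

# Product basis order: |++>, |+->, |-+>, |-->, with hbar = 1.
# dH = lam * (S_z1 - S_z2) is diagonal in this basis.
lam, E0 = 0.1, 1.0
sz1 = [0.5, 0.5, -0.5, -0.5]
sz2 = [0.5, -0.5, 0.5, -0.5]
dH_diag = [lam * (a - b) for a, b in zip(sz1, sz2)]

s = 1 / math.sqrt(2)
coupled = {
    (1, 1): [1.0, 0.0, 0.0, 0.0],
    (1, 0): [0.0, s, s, 0.0],
    (1, -1): [0.0, 0.0, 0.0, 1.0],
    (0, 0): [0.0, s, -s, 0.0],
}

def element(bra, ket):
    """<bra|dH|ket> for the diagonal operator dH."""
    return sum(b * d * k for b, d, k in zip(coupled[bra], dH_diag, coupled[ket]))

labels = [(1, 1), (1, 0), (1, -1), (0, 0)]
dH_coupled = [[element(a, b) for b in labels] for a in labels]
for row in dH_coupled:
    print(row)

# Exact upper level of the block [[E0/4, lam], [lam, -3*E0/4]] versus second order.
mean = (E0 / 4 - 3 * E0 / 4) / 2
half_gap = E0 / 2
exact_upper = mean + math.sqrt(half_gap ** 2 + lam ** 2)
second_order_upper = E0 / 4 + lam ** 2 / E0
print(exact_upper, second_order_upper)
```

The only nonzero entries of `dH_coupled` connect $|1,0\rangle$ and $|0,0\rangle$, and the two energies on the last line differ at order $\lambda^4/E_0^3$.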




2 The Hydrogen spectrum 


See Section 1.2.3 for an overview. 


2.1 Fine structure 


The term “fine structure” refers to three different contributions to the energy that are $O(m_e c^2\alpha^4)$, compared with the $O(m_e c^2\alpha^2)$ zeroth order contribution: relativistic corrections, spin-orbit coupling and the Darwin term. The contributions all arise from the Dirac equation for a particle with charge $q$ and mass $m$:
$$ H_{\text{Dirac}} = c\,\vec\alpha\cdot\left(\vec p - \frac{q}{c}\vec A\right) + \beta mc^2 + q\phi, $$
where $\vec A, \phi$ are the vector and scalar potentials of the EM field, and $\vec\alpha$ and $\beta$ are given by
$$ \vec\alpha = \begin{pmatrix} 0 & \vec\sigma \\ \vec\sigma & 0 \end{pmatrix} = \sigma_x\otimes\vec\sigma \quad\text{and}\quad \beta = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix} = \sigma_z\otimes I_2. $$
The Hilbert space here is the space of well-behaved functions from $\mathbb{R}^3 \to \mathbb{C}$ as well as a four-dimensional discrete space. Why four dimensions? Two are to account for spin, and another two are to allow a particle to be either an electron or positron. While the Dirac equation was originally motivated by the need to properly account for the fine structure of hydrogen and to unify quantum mechanics with special relativity, it as a bonus generated the prediction of antimatter.

We will not further explore the Dirac equation in 8.06, but at this point we should observe its symmetry under collective rotation. That is, if $R$ is a $3\times 3$ rotation matrix and we replace $(\sigma_1, \sigma_2, \sigma_3)$ with $(\sum_i R_{i1}\sigma_i, \sum_i R_{i2}\sigma_i, \sum_i R_{i3}\sigma_i)$ and $(p_1, p_2, p_3)$ with $(\sum_i R_{i1}p_i, \sum_i R_{i2}p_i, \sum_i R_{i3}p_i)$ (and similarly transform $\vec A, \phi$), then $H_{\text{Dirac}}$ is unchanged.

This means that $H_{\text{Dirac}}$ commutes with the collective rotation operators $\vec J = \vec L + \vec S$, although not necessarily the individual rotations $\vec L$ and $\vec S$. As a result, a good basis for the Dirac-equation version of hydrogen is likely to be $|n,l,j,m_j\rangle$ (since the Hamiltonian should be block-diagonal in $n, j$ and independent of $m_j$) instead of the alternative $|n,l,m_l,m_s\rangle$. After much calculation we will see this fact confirmed.


2.1.1 Relativistic correction 


Let’s first do some back-of-the-envelope estimates of how important relativity is to the hydrogen atom. The unperturbed ground-state wavefunction is
$$ \psi_{100}(\vec r) = \frac{e^{-r/a_0}}{\sqrt{\pi a_0^3}}, \qquad a_0 = \frac{\hbar^2}{me^2} = \frac{\hbar}{\alpha mc}. $$
We can then estimate
$$ p \sim \frac{\hbar}{a_0} = \alpha mc, \qquad v \sim \frac{p}{m} = \alpha c. $$
Thus the electron velocity is $\approx 1/137$ the speed of light. This is non-relativistic, but still fast enough that relativistic corrections will be non-negligible. Let’s compute them!
$$ \text{KE} = \sqrt{m^2c^4 + p^2c^2} - mc^2 = \underbrace{\frac{p^2}{2m}}_{\text{usual term}}\;\underbrace{-\,\frac{p^4}{8m^3c^2}}_{\delta H_{\text{rel}}} + \ldots $$
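
As a rough numerical illustration of this expansion, the sketch below works in units with $m = c = 1$ and evaluates the kinetic energy at the estimated hydrogen momentum $p = \alpha$. The value used for $\alpha$ is an assumed constant, and the function names are illustrative.

```python
import math

ALPHA = 1 / 137.036      # assumed value of the fine structure constant

def kinetic_exact(p):
    """sqrt(m^2 c^4 + p^2 c^2) - m c^2 in units with m = c = 1."""
    return math.sqrt(1.0 + p * p) - 1.0

def kinetic_expanded(p):
    """Usual term p^2/2 plus the relativistic correction -p^4/8."""
    return p * p / 2.0 - p ** 4 / 8.0

p = ALPHA
usual = p * p / 2.0
correction = -p ** 4 / 8.0
print(usual, correction, correction / usual)        # ratio is -alpha^2 / 4
print(kinetic_exact(p) - kinetic_expanded(p))       # remainder, of order p^6
```

The correction is smaller than the usual term by a factor of order $\alpha^2$, which is why it can be treated as a perturbation.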


The first-order contribution to the energy is
\begin{align}
E^1_{\text{rel};\,n,l,j} &= \langle n,l,j|\delta H_{\text{rel}}|n,l,j\rangle && \text{independent of } m_j \\
&= \langle n,l,j|\,{-\frac{p^4}{8m^3c^2}}\,|n,l,j\rangle \\
&= \ldots && \text{calculation omitted, see textbooks} \\
&= -\frac{\alpha^4 mc^2}{8n^4}\left(\frac{4n}{l + \frac{1}{2}} - 3\right)
\end{align}
One key feature of this calculation is that the answer indeed scales as $\alpha^4 mc^2$ (since $\frac{p^4}{m^3c^2} \sim \frac{(\alpha mc)^4}{m^3c^2} = \alpha^4 mc^2$) which is smaller than $E^0$ by a factor of $\alpha^2$. Also note that the answer depends on $l$ but not $j$. This is not surprising since spin never appeared. But it is inconsistent with the prediction from the Dirac equation that the energy should depend instead on $j$. To get there we will need to consider additional terms.



2.1.2 Spin-orbit coupling 
See Lecture 23 of the 2015 8.05 notes (or Griffiths) for more detail. The general formula for the energy of a dipole in a magnetic field is $\delta H = -\vec\mu\cdot\vec B$. For an electron, the dipole moment is
$$ \vec\mu_e = -\underbrace{\frac{e\hbar}{2mc}}_{\mu_B}\;\underbrace{g_e}_{\approx 2}\;\frac{\vec S}{\hbar} = -\frac{e}{mc}\vec S. $$
Compute the B field from the proton in the rest frame of the electron, which sees the proton orbiting it. If $\vec r$ is the vector pointing from the proton to the electron, then the magnetic field is
$$ \vec B = -\frac{\vec v}{c}\times\frac{e\vec r}{r^3} = -\frac{e}{c}\frac{\vec v\times\vec r}{r^3} = \frac{e\vec L}{mcr^3}. $$
(In both cases, the “mass” is technically the reduced mass $\frac{m_e m_p}{m_e + m_p}$, which we can approximate with $m \approx m_e$.) Putting this together we get a semi-classical estimate of $\delta H$:
$$ \delta H_{\text{semi-classical}} = \frac{e^2}{m^2c^2r^3}\,\vec S\cdot\vec L. $$
This is close to, but not quite, the true answer that can be obtained from the Dirac equation, which is exactly half the semi-classical estimate:
$$ \delta H_{\text{Dirac}} = \frac{e^2}{2m^2c^2r^3}\,\vec S\cdot\vec L. $$
Now we compute the first-order correction to the energies. We will use the facts that
\begin{align}
\langle nlj|\vec S\cdot\vec L|nlj\rangle &= \frac{\hbar^2}{2}\left(j(j+1) - l(l+1) - \frac{3}{4}\right) \tag{23a}\\
\langle nlj|\frac{1}{r^3}|nlj\rangle &= \frac{1}{n^3\,l(l+\frac{1}{2})(l+1)\,a_0^3} && \text{(see Griffiths problem 6.35(c))} \tag{23b}
\end{align}



Now we can calculate:
$$ E^1_{\text{so};\,n,l,j} = \frac{e^2}{2m^2c^2}\,\langle nlj|\frac{\vec S\cdot\vec L}{r^3}|nlj\rangle = \frac{\alpha^4 mc^2}{4n^4}\,\frac{n\left(j(j+1) - l(l+1) - \frac{3}{4}\right)}{l(l+\frac{1}{2})(l+1)}. $$
While the exact form of this equation is a bit hairy, we can see that its order of magnitude is $\sim\alpha^4 mc^2$, which is comparable to the relativistic correction. Also note that now there is a $j$ and $l$ dependence, because spin is part of the picture as well.


2.1.3 Fine structure 


Combining the relativistic and spin-orbit coupling leads to a miraculous cancellation
$$ E^1_{\text{fs};\,n,l,j} = E^1_{\text{rel};\,n,l,j} + E^1_{\text{so};\,n,l,j} = \frac{\alpha^4 mc^2}{8n^4}\left(3 - \frac{4n}{j + \frac{1}{2}}\right). \tag{24} $$
This derivation used the fact that $j = l \pm 1/2$ and can be proved to hold separately for each case $j = l + 1/2$ and $j = l - 1/2$. An additional complication is that the above formulas for the relativistic and spin-orbit corrections were not quite right when $l = 0$ (e.g. the spin-orbit coupling should really be zero then, and $p^4$ is not a Hermitian operator for $l = 0$); additionally, there is a third correction of order $\alpha^4 mc^2$ called the Darwin shift, due to the fact that the electron is delocalized across a distance given by its Compton wavelength (which is $\approx\alpha a_0$). However, this term affects only the $l = 0$ term and, together with the correct $l = 0$ versions of the spin-orbit and relativistic couplings, ends up giving precisely the formula in (24).

To summarize, while the previous calculations had limited validity, (24) is exactly correct for all values of $n$ and $j$. It can also be derived directly from the Dirac equation. We reassuringly find that it depends only on $n$ and $j$ and not on other quantum numbers.

Draw a diagram showing the $n = 2$ states of hydrogen. The zeroth order energy is $-\frac{\alpha^2 mc^2}{8}$. The fine structure then contributes energy $-\frac{5}{128}\alpha^4 mc^2$ to the $2S_{1/2}$ and $2P_{1/2}$ states and energy $-\frac{1}{128}\alpha^4 mc^2$ to the $2P_{3/2}$ states. Are the $2S_{1/2}$ and $2P_{1/2}$ states degenerate? It turns out that the Lamb shift leads to a further splitting of these two levels, of order $\alpha^5\ln(1/\alpha)\,mc^2$. The Lamb shift comes from the interaction of the electron with the electromagnetic field (since even a harmonic oscillator in the ground state has nonzero expectation value for observables like $\vec E^2$) and a precise derivation of the Lamb shift requires QED.
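
The cancellation leading to (24) can be verified with exact rational arithmetic. The sketch below writes the relativistic and spin-orbit shifts in units of $\alpha^4 mc^2$, using the standard first-order expressions for $l \geq 1$, and checks that their sum depends only on $n$ and $j$. The numerical values of $\alpha$ and $mc^2$ are assumed constants used only to convert the $n = 2$ splitting to eV.

```python
from fractions import Fraction

def e_rel(n, l):
    """Relativistic shift in units of alpha^4 m c^2."""
    return -Fraction(1, 8 * n ** 4) * (Fraction(4 * n) / (l + Fraction(1, 2)) - 3)

def e_so(n, l, j):
    """Spin-orbit shift in units of alpha^4 m c^2 (valid for l >= 1)."""
    numerator = n * (j * (j + 1) - l * (l + 1) - Fraction(3, 4))
    return Fraction(1, 4 * n ** 4) * numerator / (l * (l + Fraction(1, 2)) * (l + 1))

def e_fs(n, j):
    """Combined fine-structure shift (24) in units of alpha^4 m c^2."""
    return Fraction(1, 8 * n ** 4) * (3 - Fraction(4 * n) / (j + Fraction(1, 2)))

# The sum of the two shifts should equal (24) for both j = l - 1/2 and j = l + 1/2.
for n in range(2, 5):
    for l in range(1, n):
        for j in (l - Fraction(1, 2), l + Fraction(1, 2)):
            assert e_rel(n, l) + e_so(n, l, j) == e_fs(n, j)

alpha = 1 / 137.036          # assumed value of the fine structure constant
mc2_eV = 510998.95           # assumed electron rest energy in eV
splitting_eV = (e_fs(2, Fraction(3, 2)) - e_fs(2, Fraction(1, 2))) * alpha ** 4 * mc2_eV
print(e_fs(2, Fraction(1, 2)), e_fs(2, Fraction(3, 2)), float(splitting_eV))
```

The two fractions printed for $n = 2$ are the coefficients of $\alpha^4 mc^2$ for the $j = 1/2$ and $j = 3/2$ levels, and the last number is their separation in eV.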


2.2 Hyperfine splitting 


This was covered in 8.05, but I want to review the derivation and briefly justify one point that we 
previously did not have the tools for. 

The electron and proton are both magnetic dipoles and thus contribute to the Hamiltonian a term
$$ \delta H_{\text{HF}} = -\vec\mu_e\cdot\vec B_{\text{proton dipole}}, $$
where the B field coming from the dipole moment of the proton is
$$ \vec B_{\text{proton dipole}} = \frac{1}{r^3}\left(3(\vec\mu_p\cdot\hat r)\hat r - \vec\mu_p\right) + \frac{8\pi}{3}\vec\mu_p\,\delta^3(\vec r). \tag{25} $$

The first term has the property that $\langle\psi|\text{first term}|\psi\rangle = 0$ if $|\psi\rangle$ is an $l = 0$ state (because the rotational invariance of $|\psi\rangle$ means we can replace $\hat r_i\hat r_j$ with its average over rotations, which is $\frac{1}{3}\delta_{ij}$). This was discussed in 8.05, but now first-order perturbation theory lets us rigorously justify that this means that the first term contributes zero to the energy of $l = 0$ states at first order in perturbation theory.

The second term is called the “contact term” because of the presence of the delta function. Again we can use perturbation theory to obtain that the first-order correction to the energy of the state $|\psi_{\text{spatial}}\rangle\otimes|\psi_{\text{spin}}\rangle$ is
\begin{align}
E^1_{\text{hf}} &= -\langle\psi_{\text{spatial}}|\otimes\langle\psi_{\text{spin}}|\left(\frac{8\pi}{3}\,\vec\mu_e\cdot\vec\mu_p\,\delta^3(\vec r)\right)|\psi_{\text{spatial}}\rangle\otimes|\psi_{\text{spin}}\rangle \\
&= \frac{g_e g_p}{m_e m_p}\,\frac{2\pi e^2}{3c^2}\,\langle\psi_{\text{spin}}|\vec S_e\cdot\vec S_p|\psi_{\text{spin}}\rangle\,\langle\psi_{\text{spatial}}|\delta^3(\vec r)|\psi_{\text{spatial}}\rangle
\end{align}

We now should pause to consider good bases. In fact, treating the spatial and spin wavefunctions as a tensor product was already an assumption that cannot always be justified, since the fine structure wants a basis where $j$ (involving both spatial and electron-spin degrees of freedom) is well-defined. However for the $n = 1$ state of hydrogen, we always have $l = 0$ and $j = 1/2$. Thus $|\psi_{\text{spatial}}\rangle = |1,0,0\rangle$ and we obtain a factor of $\frac{1}{\pi a_0^3}$ from the $\langle\psi_{\text{spatial}}|\delta^3(\vec r)|\psi_{\text{spatial}}\rangle$ term.

For the electron and nuclear spins, we have so far a degenerate Hamiltonian. Thus we will choose a basis for the spin space that diagonalizes the hyperfine splitting. The eigenvalues of $\vec S_e\cdot\vec S_p$ are $\frac{1}{4}\hbar^2$ with degeneracy 3 (the triplet states) and $-\frac{3}{4}\hbar^2$ (the singlet state). Thus we find that the hyperfine splitting is
$$ \Delta E_{\text{hf}} = E^1_{1,0,0,\text{triplet}} - E^1_{1,0,0,\text{singlet}} = \frac{4}{3}\,g_p\,\frac{m_e}{m_p}\,\alpha^4 m_e c^2 = 5.9\cdot 10^{-6}\,\text{eV}. $$
The wavelength $\lambda = \frac{hc}{\Delta E_{\text{hf}}}$ is 21cm, and radiation at this wavelength plays a central role in radio astronomy.
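
The quoted numbers can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: it uses the standard first-order result $\Delta E_{\text{hf}} = \frac{4}{3} g_p \frac{m_e}{m_p}\alpha^4 m_e c^2$ with $g_e = 2$, and all constants below (including the proton g-factor, which the notes do not quote) are assumed reference values.

```python
alpha = 1 / 137.036          # fine structure constant (assumed value)
me_c2_eV = 510998.95         # electron rest energy in eV (assumed value)
mass_ratio = 1 / 1836.15     # m_e / m_p (assumed value)
g_p = 5.586                  # proton g-factor (assumed value, not given in the text)
hc_eV_cm = 1.23984e-4        # h*c in eV*cm (assumed value)

def hyperfine_splitting_eV():
    """Triplet-singlet splitting (4/3) g_p (m_e/m_p) alpha^4 m_e c^2."""
    return (4.0 / 3.0) * g_p * mass_ratio * alpha ** 4 * me_c2_eV

delta_E = hyperfine_splitting_eV()
wavelength_cm = hc_eV_cm / delta_E      # lambda = h c / Delta E
print(delta_E, wavelength_cm)
```

The first number should land close to $5.9\cdot 10^{-6}$ eV and the second close to 21 cm.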


2.3 Zeeman effect


We have so far considered internal magnetic fields, but what happens when we apply an external magnetic field? Then the contribution to energy from the interaction of this field with the electron orbital angular momentum and spin angular momentum is
$$ \delta H_{\text{Zeeman}} = -\Big(\underbrace{\vec\mu_L}_{-\mu_B\vec L/\hbar} + \underbrace{\vec\mu_S}_{-2\mu_B\vec S/\hbar}\Big)\cdot\underbrace{\vec B_{\text{ext}}}_{(0,0,B)} = \frac{eB}{2m_e c}\left(L_z + 2S_z\right) \quad\text{using } \mu_B = \frac{e\hbar}{2m_e c}. $$
Because $\vec L$ and $\vec S$ are multiplied by different g-factors (1 and $\approx 2$ respectively), we do not simply end up with something that depends on $\vec J$. As a result, the states $|n,l,j,m_j\rangle$ that were good for the fine structure do not diagonalize $\delta H_{\text{Zeeman}}$. One basis that would diagonalize the Zeeman Hamiltonian is $|n,l,m_l,m_s\rangle$. However, if we use this, then the fine structure is no longer diagonal! This is a fundamental problem: $\delta H_{\text{FS}}$ and $\delta H_{\text{Zeeman}}$ do not commute, and thus there is no basis that simultaneously diagonalizes them.



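To solve this problem we will use perturbation theory. But which Hamiltonian is the base Hamiltonian and which is the perturbation will depend on how strong the magnetic field is.

2.3.1 Strong-field Zeeman

If $E_{\text{Zeeman}} \gg E_{\text{FS}}$ then we treat $H_0 + \delta H_{\text{Zeeman}}$ as the base Hamiltonian, use the $|n,l,m_l,m_s\rangle$ basis and treat $\delta H_{\text{FS}}$ as the perturbation. In this case the energy is
$$ E_{n l m_l m_s} = \underbrace{E_n^0}_{-\frac{\alpha^2 m_e c^2}{2n^2}} + \mu_B B\,(m_l + 2m_s) + E^1_{\text{fs}} + \ldots \tag{26} $$
To compute the fine-structure contribution we need to evaluate things like $\langle\vec L\cdot\vec S\rangle$ and $\langle p^4\rangle$ with respect to the states $|n,l,m_l,m_s\rangle$. We will not fully carry out this calculation. One example is the spin-orbit coupling which is proportional to
$$ \langle\vec L\cdot\vec S\rangle = \langle L_x\rangle\langle S_x\rangle + \langle L_y\rangle\langle S_y\rangle + \langle L_z\rangle\langle S_z\rangle = \langle L_z\rangle\langle S_z\rangle = \hbar^2 m_l m_s. $$
Here we have used the fact that if a state $|\psi\rangle$ satisfies $J_z|\psi\rangle = \lambda|\psi\rangle$ (for any operators $J_x, J_y, J_z$ with the appropriate commutation relations for angular momentum) then
$$ \langle\psi|J_x|\psi\rangle = \langle\psi|\frac{J_yJ_z - J_zJ_y}{i\hbar}|\psi\rangle = \frac{\lambda - \lambda}{i\hbar}\langle\psi|J_y|\psi\rangle = 0 $$
and similarly $\langle\psi|J_y|\psi\rangle = 0$.
After some (omitted) calculations we arrive at
$$ E^1_{\text{fs};\,n l m_l m_s} = \frac{\alpha^4 m_e c^2}{2n^3}\times\begin{cases} \dfrac{3}{4n} - 1 & \text{if } l = 0 \\[2ex] \dfrac{3}{4n} - \dfrac{l(l+1) - m_l m_s}{l(l+\frac{1}{2})(l+1)} & \text{otherwise} \end{cases} \tag{27} $$

The field strength needed for this is not unreasonable. Since the fine-structure splitting between $j = 1/2$ and $j = 3/2$ is about $4.5\cdot 10^{-5}$ eV for $n = 2$ (from (24), it is $\alpha^4 mc^2/32$) and $\mu_B = 5.8\cdot 10^{-5}$ eV/T, we need a field of about one Tesla, which is large but achievable. This strong-field case is sometimes called the Paschen-Back effect.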


2.3.2 Weak-field Zeeman effect 


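Now we write
$$ H = \underbrace{H_0 + \delta H_{\text{FS}}}_{\text{base Hamiltonian}} + \delta H_{\text{Zeeman}}. \tag{28} $$
As we’ve seen before, the eigenbasis of $H_0 + \delta H_{\text{FS}}$ is $|n,l,j,m_j\rangle$, so the first-order energy shift when we add a magnetic field is
\begin{align}
E^1_{\text{Zeeman}} &= \frac{\mu_B B}{\hbar}\langle n,l,j,m_j|L_z + 2S_z|n,l,j,m_j\rangle \\
&= \frac{\mu_B B}{\hbar}\langle n,l,j,m_j|J_z + S_z|n,l,j,m_j\rangle \\
&= \mu_B B\,m_j + \frac{\mu_B B}{\hbar}\langle n,l,j,m_j|S_z|n,l,j,m_j\rangle
\end{align}
How do we evaluate this $S_z$ term? One method is described in Griffiths, and involves an argument about the Heisenberg-picture time average of $\vec S$ under the fine-structure Hamiltonian. A more direct method is to use Clebsch-Gordan coefficients. Indeed
$$ |j = l \pm \tfrac{1}{2}, m_j\rangle = \pm\sqrt{\frac{l \pm m_j + \frac{1}{2}}{2l+1}}\;|l, m_l = m_j - \tfrac{1}{2}, m_s = \tfrac{1}{2}\rangle + \sqrt{\frac{l \mp m_j + \frac{1}{2}}{2l+1}}\;|l, m_l = m_j + \tfrac{1}{2}, m_s = -\tfrac{1}{2}\rangle \tag{29} $$
so if $\Pr[\pm]$ denotes the probability of obtaining outcome $\pm\hbar/2$ when measuring the spin, then we have
$$ \langle S_z\rangle = \frac{\hbar}{2}\left(\Pr[+] - \Pr[-]\right) = \frac{\hbar}{2}\,\frac{(l \pm m_j + \frac{1}{2}) - (l \mp m_j + \frac{1}{2})}{2l+1} = \pm\frac{\hbar m_j}{2l+1}. $$
We conclude that
$$ E^1_{\text{Zeeman};\,n,l,j,m_j} = \frac{e\hbar B}{2m_e c}\,m_j\underbrace{\left(1 \pm \frac{1}{2l+1}\right)}_{g_J}. $$
The underbraced term $g_J$ is called the Landé g-factor, after Alfred Landé, who discovered it in 1921. We see throughout the spectrum of the hydrogen atom many of the precursors of modern quantum theory.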
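
The weak-field result can be cross-checked with exact fractions. The sketch below computes $\Pr[+]$ from the squared Clebsch-Gordan coefficient, forms $m_j + \langle S_z\rangle/\hbar$, and compares it with $g_J m_j$ for $g_J = 1 \pm \frac{1}{2l+1}$. It also compares against the general textbook Landé formula $g_J = 1 + \frac{j(j+1) - l(l+1) + s(s+1)}{2j(j+1)}$, which is not derived in these notes and is quoted here only as an assumption for comparison.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def spin_up_probability(l, j, m_j):
    """Pr[+] from the squared Clebsch-Gordan coefficient for j = l +/- 1/2."""
    sign = 1 if j == l + HALF else -1
    return (l + sign * m_j + HALF) / (2 * l + 1)

def g_from_text(l, j):
    """g_J = 1 +/- 1/(2l+1) for j = l +/- 1/2."""
    sign = 1 if j == l + HALF else -1
    return 1 + Fraction(sign, 2 * l + 1)

def g_general(l, j, s=HALF):
    """Textbook Lande formula, quoted here as an assumption for comparison."""
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))

for l in range(1, 4):
    for j in (l - HALF, l + HALF):
        assert g_from_text(l, j) == g_general(l, j)
        m_j = j
        while m_j >= -j:
            p_up = spin_up_probability(l, j, m_j)
            sz_over_hbar = HALF * (p_up - (1 - p_up))    # <S_z> / hbar
            assert m_j + sz_over_hbar == g_from_text(l, j) * m_j
            m_j -= 1

print(g_from_text(1, Fraction(1, 2)), g_from_text(1, Fraction(3, 2)))
```

The last line prints the g-factors of the $P_{1/2}$ and $P_{3/2}$ levels.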


3 WKB 


3.1 Introduction to WKB 


Perturbation theory covers the case when $\delta H_{mn}$ is small relative to $|E^0_m - E^0_n|$. The Wentzel-Kramers-Brillouin (WKB) approximation covers a different limit, when quantum systems are in some ways approximately classical. For this reason, it is an example of a semi-classical approximation. It will result in powers not of $\delta H$ but of $\hbar$, and thus becomes exact in the “classical” limit $\hbar \to 0$.

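We begin with some exact manipulations of the Schrödinger equation. For a spin-0 mass-$m$ particle in 3-d, define
$$\begin{aligned}\rho(\vec x,t) &= |\psi(\vec x,t)|^2 && \text{probability density}\\ \vec J(\vec x,t) &= \frac\hbar m\,\mathrm{Im}(\psi^*\vec\nabla\psi) && \text{probability flux}\end{aligned}$$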


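The flux $\vec J$ has units of probability / (area × time) and can be seen to be related to momentum as follows:
$$\vec J(\vec x,t) = \frac\hbar m\,\mathrm{Im}\left(\psi^*\frac{i\vec p}{\hbar}\psi\right) = \frac1m\,\mathrm{Re}\left(\psi^*\vec p\,\psi\right),$$
where $\vec p = -i\hbar\vec\nabla$.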


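Thus, if we integrate over all $\vec x$, we obtain
$$\int d^3x\,\vec J = \frac1m\int d^3x\,\mathrm{Re}(\psi^*\vec p\,\psi) = \frac1m\,\mathrm{Re}\langle\psi|\vec p|\psi\rangle = \frac{\langle\vec p\rangle}{m}.$$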




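One can show (using the Schrödinger equation) the conservation equation
$$\frac{\partial\rho}{\partial t} + \vec\nabla\cdot\vec J = 0. \tag{30}$$
We can also re-express the wavefunction using $\rho$ as
$$\psi(\vec x,t) = \sqrt{\rho(\vec x,t)}\,e^{iS(\vec x,t)/\hbar} \tag{31}$$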


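with $S(\vec x,t)$ real so that the exponent contributes a pure phase. What does $S$ mean physically? In terms of $\rho, S$ we can compute
$$\psi^*\vec\nabla\psi = \underbrace{\tfrac12\vec\nabla\rho}_{\text{real}} + \underbrace{\tfrac i\hbar\rho\vec\nabla S}_{\text{imaginary}}$$
$$\vec J(\vec x,t) = \frac\hbar m\,\mathrm{Im}(\psi^*\vec\nabla\psi) = \frac{\rho\vec\nabla S}{m}$$
Thus we obtain a physical interpretation for the phase.
$$\vec J(\vec x,t) = \rho\,\frac{\vec\nabla S}{m}. \tag{32}$$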


Namely its gradient relates to the probability flux. If we think of the flux as $\vec J \approx \rho\,\vec p/m$ then $\vec\nabla S \approx \vec p$. These equivalences are pretty loose, but we will build on this intuition as we proceed. For now, observe that for a free particle, they indeed give the right idea:
$$\psi_{\text{free}}(\vec x,t) = e^{\frac{i\vec p\cdot\vec x}{\hbar} - \frac{iEt}{\hbar}}.$$
So $S = \vec p\cdot\vec x - Et$ and we have $\vec\nabla S = \vec p$ exactly. We also have $\nabla^2S = 0$, and it will turn out later that this quantity will measure how “non-plane-wave-like” our wavefunction is.


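Schrödinger equation for a general 1-D potential Consider a region where $V(x) < E$ (called “classically allowed”). Then
$$\begin{aligned}-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} &= (E - V(x))\psi(x)\\ -\hbar^2\frac{d^2\psi}{dx^2} &= \underbrace{2m(E - V(x))}_{p^2(x)}\psi(x)\end{aligned}$$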


We can interpret $p(x)$ as a classical momentum. The solution of this corresponds (roughly) to oscillations with period $\lambda(x) = \frac{2\pi\hbar}{p(x)}$, which is the de Broglie wavelength corresponding to momentum $p(x)$.
What if $V(x) > E$? Such regions are called “classically forbidden.” Then we get
$$\hbar^2\frac{d^2\psi}{dx^2} = \underbrace{2m(V(x) - E)}_{\kappa^2(x)}\psi(x),$$
corresponding to solutions that exponentially decay at rate $\kappa(x)/\hbar$.

The claims of “oscillating with period $\lambda(x)$” or “decaying at rate $\kappa(x)/\hbar$” are only rigorous when $p(x)$ or $\kappa(x)$ are independent of $x$. But we will see how they can be good approximations even when $p(x)$ or $\kappa(x)$ are merely slowly varying with $x$.

Let’s write $\psi(x) = \exp(iS(x)/\hbar)$, for some $S$ with units of angular momentum. Since $S$ can be complex, this is without loss of generality.

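Substituting (in the classically allowed region) we get
$$\begin{aligned}\left(\hbar^2\frac{d^2}{dx^2} + p^2(x)\right)e^{\frac{iS(x)}{\hbar}} &= 0\\ \left(\hbar^2\left(\frac{iS'}{\hbar}\right)^2 + i\hbar S'' + p^2(x)\right)e^{\frac{iS(x)}{\hbar}} &= 0\end{aligned}$$
The $e^{\frac{iS(x)}{\hbar}}$ terms drop out and we obtain
$$(S')^2 - i\hbar S'' = p^2(x) \tag{33}$$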


So far this is exact. But if the potential is slowly varying, then $p(x)$ is slowly varying, and the $-i\hbar S''$ term will be small. What does “slowly varying” mean? The wavelength $\lambda(x) = \frac{2\pi\hbar}{p(x)}$ should be small relative to the length scale over which $p(x)$ varies. In the classical limit $\hbar \to 0$ we have $\lambda(x) \to 0$. Thus it makes sense to expand around this limit in powers of $\hbar$, in a way analogous to our perturbation-theory strategy of expanding in terms of the perturbation. Thus we write


$$S(x) = \underbrace{S_0(x) + \hbar S_1(x)}_{\text{WKB approximation}} + \hbar^2S_2(x) + \ldots$$
and will take only the first two terms to be the WKB approximation.
Let us now substitute $S(x) = S_0(x) + \hbar S_1(x)$ into (33). We obtain


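$$\begin{aligned}(S_0' + \hbar S_1')^2 - i\hbar(S_0'' + \hbar S_1'') &= p^2(x)\\ (S_0')^2 + \hbar(2S_0'S_1' - iS_0'') + \hbar^2((S_1')^2 - iS_1'') &= p^2(x)\end{aligned}$$
We want to equate powers of $\hbar$. Treating $\hbar$ here as a formal parameter, we obtain
$$\begin{aligned}(S_0')^2 &= p^2(x) && \text{at } O(\hbar^0) && \text{(34a)}\\ 2S_0'S_1' - iS_0'' &= 0 && \text{at } O(\hbar^1) && \text{(34b)}\end{aligned}$$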
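From (34a), we obtain $S_0' = \pm p(x)$, which we can solve to obtain
$$S_0(x) = \pm\int_{x_0}^x p(x')\,dx',$$
where $x_0$ is arbitrary. Substituting $S_0' = \pm p(x)$, $S_0'' = \pm p'(x)$ into (34b) we find
$$S_1'(x) = \frac i2\frac{p'(x)}{p(x)}.$$
This has solution
$$S_1(x) = \frac i2\ln p(x) + C.$$
Substituting into our equation for $\psi(x)$ we get
$$\psi(x) = e^{iS(x)/\hbar} \approx e^{\frac i\hbar(S_0 + \hbar S_1)} = \exp\left(\pm\frac i\hbar\int_{x_0}^x p(x')\,dx'\right)\exp\left(-\frac12\ln p(x) + iC\right)$$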


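Thus in the classically allowed regions we have the solution
$$\psi(x) \approx \frac{A}{\sqrt{p(x)}}\exp\left(\frac i\hbar\int^x p(x')\,dx'\right) + \frac{B}{\sqrt{p(x)}}\exp\left(-\frac i\hbar\int^x p(x')\,dx'\right) \tag{36}$$
with $p(x) = \sqrt{2m(E - V(x))}$. The solution in the classically forbidden regions is the same but with $p(x) = i\kappa(x)$. This leads to
$$\psi(x) \approx \frac{C}{\sqrt{\kappa(x)}}\exp\left(-\frac1\hbar\int^x\kappa(x')\,dx'\right) + \frac{D}{\sqrt{\kappa(x)}}\exp\left(\frac1\hbar\int^x\kappa(x')\,dx'\right) \tag{37}$$
for constants $C, D$.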


3.2 Validity of WKB


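Let’s just look at the first part of the classically allowed solution: $\psi(x) = \frac{A}{\sqrt{p(x)}}\exp\left(\frac i\hbar\int^x p(x')\,dx'\right)$.

The probability density is $\rho(x) = |\psi(x)|^2 = \frac{|A|^2}{p(x)} = \frac{|A|^2}{m\,v(x)}$, where $v(x)$ can be thought of as a classical velocity. This makes sense because it says that the particle spends less time in regions where it is moving faster.

Another check is to look at the probability current: $J = \frac\hbar m\,\mathrm{Im}\left(\psi^*\frac{\partial\psi}{\partial x}\right)$. We calculate
$$\begin{aligned}\psi' &= \frac{ip(x)}{\hbar}\psi - \frac{p'(x)}{2p(x)}\psi\\ \psi^*\psi' &= \frac{ip(x)}{\hbar}|\psi|^2 - \frac{p'(x)}{2p(x)}|\psi|^2\\ J &= \frac\hbar m\,\frac{p(x)}{\hbar}\,\rho(x) = \rho(x)\,v(x) = \frac{|A|^2}{m}.\end{aligned}$$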


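Next, let’s look at the first discarded term. Our approximation assumed that $|\hbar^2(S_1')^2| \ll |\hbar S_0'S_1'|$, or equivalently, $|\hbar S_1'| \ll |S_0'|$. In terms of $p(x)$ this condition states (dropping factors of order one) that
$$\hbar\left|\frac{p'}{p}\right| \ll |p| \quad\Longleftrightarrow\quad \left|\frac{d}{dx}\frac\hbar p\right| = \frac{1}{2\pi}\left|\frac{d\lambda}{dx}\right| \ll 1. \tag{38}$$
In other words, the de Broglie wavelength should be slowly varying. How slowly? Return to $\hbar\left|\frac{p'}{p}\right| \ll |p|$. Rearranging, we obtain
$$|p| \gg \frac{\hbar}{|p|}|p'| = \frac{\lambda(x)}{2\pi}\left|\frac{dp}{dx}\right|.$$
In other words, the change of $p$ over a de Broglie wavelength should be $\ll |p|$.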




Figure 1: Example of a potential V(x) that is finite only in the interval 0 < x < a. 


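We can relate this back to the potential energy. Dropping factors of order one,
$$\begin{aligned}p^2 &= 2m(E - V)\\ |2pp'| &= 2m|V'|\\ \hbar|p'| &= \frac{m\hbar}{|p|}|V'| \sim m\lambda(x)|V'|\\ \hbar|p'| &\ll |p|^2 \qquad\text{from (38)}\\ m\lambda(x)|V'| &\ll |p|^2\\ \lambda(x)|V'| &\ll \frac{|p|^2}{m}\end{aligned}$$
The change of potential energy over one wavelength should be $\ll$ the kinetic energy.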
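To get a feel for where this condition holds, one can evaluate the ratio $\hbar|p'|/p^2$ for a concrete potential. The sketch below does this for a harmonic oscillator level; the choice of potential, the units $\hbar = m = \omega = 1$, and the function name are assumptions made only for illustration.

```python
import math

def wkb_validity_ratio(x, n, m=1.0, omega=1.0, hbar=1.0):
    """Return hbar*|p'(x)| / p(x)^2 for the n-th harmonic-oscillator level.

    WKB is trustworthy where this ratio is much smaller than 1."""
    E = hbar * omega * (n + 0.5)
    p_sq = 2.0 * m * (E - 0.5 * m * omega**2 * x**2)
    if p_sq <= 0:
        raise ValueError("x is not in the classically allowed region")
    dV = m * omega**2 * x
    # From p^2 = 2m(E - V) we get |p'| = m|V'|/|p|, so the ratio is hbar*m*|V'|/|p|^3.
    return hbar * m * abs(dV) / p_sq**1.5

n = 10
x_turn = math.sqrt(2 * n + 1)  # turning point in units where hbar/(m*omega) = 1
for frac in (0.1, 0.5, 0.9, 0.99, 0.999):
    print(frac, wkb_validity_ratio(frac * x_turn, n))
```

The ratio is tiny in the middle of the well and blows up as $x$ approaches the turning point, where $p \to 0$.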


3.3 Bohr-Sommerfeld quantization


The WKB approximation can be used to generalize the old idea that quantum “orbits” should have action (integral of $p\,dx$) that is an integer multiple of $h = 2\pi\hbar$. This idea is part of what is called “old quantum mechanics” because it predates the modern (ca. 1925) formulation in terms of the Schrödinger equation.

We illustrate this with an example. Consider a potential $V(x)$ such that $V(x) = \infty$ for $x < 0$ or $x > a$ and $V(x)$ is finite for $0 < x < a$. This is depicted in Fig. 1.

Assume that $E > V(x)$ for all $0 < x < a$. Then the solution has the form


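$$\begin{aligned}\psi(x) &= \frac{1}{\sqrt{p(x)}}\left(A\exp\left(\frac i\hbar\int_0^x p(x')\,dx'\right) + B\exp\left(-\frac i\hbar\int_0^x p(x')\,dx'\right)\right)\\ &= \frac{1}{\sqrt{p(x)}}\left(Ae^{i\phi(x)} + Be^{-i\phi(x)}\right), \qquad \phi(x) \equiv \frac1\hbar\int_0^x p(x')\,dx'\\ &= \frac{1}{\sqrt{p(x)}}\left(C\cos(\phi(x)) + D\sin(\phi(x))\right)\end{aligned}$$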


From the boundary condition $\psi(0) = 0$ and the fact that $\phi(0) = 0$ we obtain $C = 0$. From the condition $\psi(a) = 0$ we find that $\phi(a) = n\pi$. Plugging in the definition of $\phi(a)$ we find the quantization condition:
$$\frac1\hbar\int_0^a dx\,\sqrt{2m(E_n - V(x))} = n\pi, \tag{39}$$
where we have defined $E_n$ to be the $n^{\text{th}}$ energy level.




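As a sanity check, if $V(x) = 0$ then (39) yields
$$\frac1\hbar\,a\sqrt{2mE_n} = n\pi \quad\Longrightarrow\quad E_n = \frac{n^2\pi^2\hbar^2}{2ma^2}, \tag{40}$$
which is the exact spectrum of the infinite square well.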
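The quantization condition (39) can also be solved numerically for potentials where the integral has no closed form. The sketch below is one possible approach, not part of the notes: it evaluates the phase integral with a midpoint rule and bisects on $E$. The helper names, the units $\hbar = m = 1$, and the energy bracket are all assumptions; it also assumes $E > V(x)$ throughout the well, as in the text.

```python
import math

def phase_integral(E, V, a, m=1.0, hbar=1.0, steps=2000):
    """Midpoint-rule estimate of (1/hbar) * integral_0^a sqrt(2m(E - V(x))) dx."""
    dx = a / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += math.sqrt(max(2.0 * m * (E - V(x)), 0.0)) * dx
    return total / hbar

def wkb_level(n, V, a, E_lo, E_hi, m=1.0, hbar=1.0):
    """Bisect on E until the phase integral equals n*pi (two hard walls)."""
    target = n * math.pi
    for _ in range(100):
        E_mid = 0.5 * (E_lo + E_hi)
        if phase_integral(E_mid, V, a, m, hbar) < target:
            E_lo = E_mid
        else:
            E_hi = E_mid
    return 0.5 * (E_lo + E_hi)

a = 1.0
flat = lambda x: 0.0  # V(x) = 0 inside the well
levels = [wkb_level(n, flat, a, 0.0, 1000.0) for n in (1, 2, 3)]
exact = [n**2 * math.pi**2 / (2 * a**2) for n in (1, 2, 3)]
print(levels, exact)
```

For the flat well the numerical levels should reproduce (40); replacing `flat` with another function gives the WKB estimate for a well with a non-trivial floor.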


What if we have “soft” walls? For this we need to connect the oscillating solutions in the allowed regions with the decaying solutions in the forbidden regions. This is achieved by the connection formulae.

To see the need for this, let’s examine the integral of $p(x)$ over the classically allowed region of a harmonic oscillator,
$$H = \frac{p^2}{2m} + \frac12m\omega^2x^2.$$
The energy levels are of course $E_n = \hbar\omega(n + 1/2)$. The turning points are given by the solutions to the equation $\frac12m\omega^2x^2 = \hbar\omega(n + 1/2)$. These are
$$x = \pm l\sqrt{2n+1}, \qquad l \equiv \sqrt{\frac{\hbar}{m\omega}}. \tag{41}$$


mw 


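Let’s see what happens when we integrate $p(x)/\hbar$ between these turning points. We obtain
$$\begin{aligned}\frac1\hbar\int_{-l\sqrt{2n+1}}^{l\sqrt{2n+1}}p(x)\,dx &= \frac1\hbar\int_{-l\sqrt{2n+1}}^{l\sqrt{2n+1}}\sqrt{2m\left(\hbar\omega(n + 1/2) - \frac12m\omega^2x^2\right)}\,dx\\ &= \int_{-l\sqrt{2n+1}}^{l\sqrt{2n+1}}\sqrt{2n + 1 - \frac{x^2}{l^2}}\,\frac{dx}{l}\\ &= (2n+1)\int_{-1}^{1}\sqrt{1 - u^2}\,du \qquad\text{defining } u = \frac{x}{l\sqrt{2n+1}}\\ &= \pi(n + 1/2).\end{aligned}$$
This differs by $\pi/2$ from the integer multiple of $\pi$ that we found with hard walls. We will see below why this is.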
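The value $\pi(n + 1/2)$ is easy to confirm by direct numerical integration. The following sketch uses a simple midpoint rule in units $\hbar = m = \omega = 1$; the function name and step count are illustrative choices rather than anything prescribed by the notes.

```python
import math

def ho_phase_integral(n, m=1.0, omega=1.0, hbar=1.0, steps=100_000):
    """Midpoint estimate of (1/hbar) * integral of p(x) between the turning points."""
    E = hbar * omega * (n + 0.5)
    x_turn = math.sqrt(2.0 * E / (m * omega**2))
    dx = 2.0 * x_turn / steps
    total = 0.0
    for i in range(steps):
        x = -x_turn + (i + 0.5) * dx
        p_sq = 2.0 * m * (E - 0.5 * m * omega**2 * x**2)
        total += math.sqrt(max(p_sq, 0.0)) * dx
    return total / hbar

# Each ratio should be close to n + 1/2.
ratios = [ho_phase_integral(n) / math.pi for n in range(4)]
print(ratios)
```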

Before continuing, we can see that the WKB approximation does give a pretty accurate picture of the harmonic oscillator. In the forbidden region ($|x| > l\sqrt{2n+1}$) it describes the wavefunction as exponentially decaying. Specifically suppose that $|x| \gg l\sqrt{2n+1}$ so that $\kappa(x) = \sqrt{2m(V(x) - E_n)} \approx \sqrt{2mV(x)} = m\omega|x|$. Integrating this we get
$$\psi(x) \propto \exp\left(-\frac{m\omega x^2}{2\hbar}\right) = e^{-\frac{x^2}{2l^2}}, \tag{42}$$
which gives the correct rate of exponential decay.
In the classically allowed region WKB also correctly predicts the number of oscillations of the wavefunction. But how can WKB predict the $\pi(n + 1/2)$ result we found above?


3.4 Connection formulae 


When $E = V(a)$ then we say that $a$ is a “turning point”. Turning points separate allowed from
forbidden regions, and therefore oscillating from decaying solutions. However, near a turning point, 
the WKB approximation breaks down. So if we want to glue together oscillating and decaying 
solutions, we cannot just match boundary conditions at the border. Something nontrivial will 
happen at the turning point, which could involve reflection/transmission as well as phase shifts. 




There are two approaches to this, both difficult. One that we will not explore is to use complex analysis and analytically continue the wavefunction to complex-valued $x$. In this way we can avoid going near $a$: $x$ goes up to $a - \epsilon$ along the real line, then follows a half-circle in the complex plane to $a + \epsilon$ and then continues along the real line.

Instead we will follow Griffiths and use Airy functions. Near a turning point, we can approximate $V(x) = V(a) + (x - a)V'(a) + \ldots$. To simplify notation shift the origin and the overall energy level so that $a = 0$ and $E = V(0) = 0$. In the vicinity of the turning point, the Schrödinger equation looks like


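$$\begin{aligned}-\frac{\hbar^2}{2m}\psi''(x) + xV'(0)\psi(x) &= 0 && \text{(43)}\\ \psi''(x) &= \underbrace{\frac{2mV'(0)}{\hbar^2}}_{\alpha^3}\,x\,\psi(x) && \text{(44)}\end{aligned}$$
We define $\alpha$ in this way so that when we change coordinates to $z = \alpha x$ then $z$ is dimensionless and we obtain the dimensionless equation
$$\frac{d^2\psi}{dz^2} = z\,\psi(z). \tag{45}$$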
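This is a second-order differential equation and thus has a two-dimensional space of solutions. These are spanned by the Airy function $\mathrm{Ai}(z)$ and the Airy function of the second kind $\mathrm{Bi}(z)$. There are exact expressions for these that are somewhat unilluminating (see Griffiths or Wikipedia for details), but what will be more useful are the asymptotic formulas. These are oscillatory for $z < 0$ and exponentially decaying or growing for $z > 0$. Specifically:
$$\begin{array}{c|c|c} & \mathrm{Ai}(z) & \mathrm{Bi}(z)\\\hline z \ll 0 & \frac{1}{\sqrt\pi(-z)^{1/4}}\sin\left(\frac23(-z)^{3/2} + \frac\pi4\right) & \frac{1}{\sqrt\pi(-z)^{1/4}}\cos\left(\frac23(-z)^{3/2} + \frac\pi4\right)\\ z \gg 0 & \frac{1}{2\sqrt\pi z^{1/4}}e^{-\frac23z^{3/2}} & \frac{1}{\sqrt\pi z^{1/4}}e^{\frac23z^{3/2}}\end{array}$$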
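To see how quickly these asymptotic forms become accurate, one can compare them with the Airy functions themselves. The sketch below evaluates $\mathrm{Ai}$ and $\mathrm{Bi}$ from their Maclaurin series, a standard representation that is not used in the notes and is only adequate for moderate $|z|$; the function names are hypothetical.

```python
import math

C1 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))  # Ai(0)
C2 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))  # -Ai'(0)

def airy_series(z, terms=200):
    """Return (Ai(z), Bi(z)) from the Maclaurin series of y'' = z*y."""
    f_term, g_term = 1.0, z          # the two independent series start at 1 and z
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= z**3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= z**3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return C1 * f_sum - C2 * g_sum, math.sqrt(3.0) * (C1 * f_sum + C2 * g_sum)

def airy_asymptotic(z):
    """Leading asymptotic forms of (Ai(z), Bi(z)) for large |z|."""
    xi = (2.0 / 3.0) * abs(z) ** 1.5
    pref = 1.0 / (math.sqrt(math.pi) * abs(z) ** 0.25)
    if z < 0:
        return pref * math.sin(xi + math.pi / 4), pref * math.cos(xi + math.pi / 4)
    return 0.5 * pref * math.exp(-xi), pref * math.exp(xi)

for z in (-5.0, 4.0):
    print(z, airy_series(z), airy_asymptotic(z))
```

Already at $|z|$ of order 4 or 5 the leading asymptotic forms agree with the series values to within a few percent.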


We can then match up the Airy function near the turning point with the decaying and oscillating solutions that are valid far from the turning point. This yields connection formulas. The high-level picture is that in the classically allowed region we have two oscillating solutions, near the turning point we have two Airy functions as solutions and in the forbidden region we have two exponentially decaying/growing solutions. At each boundary we have two constraints given by continuity of $\psi(x)$ and $\psi'(x)$. These give the following connection formulae:


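Allowed on the left, forbidden on the right Consider the turning point depicted in Fig. 2(a), located at $x = a$. Then we find
$$\begin{aligned}\psi(x) &\approx \frac{2A}{\sqrt{p(x)}}\sin\left(\frac1\hbar\int_x^a p(x')\,dx' + \frac\pi4\right) + \frac{B}{\sqrt{p(x)}}\cos\left(\frac1\hbar\int_x^a p(x')\,dx' + \frac\pi4\right) && x \ll a && \text{(46a)}\\ \psi(x) &\approx \frac{A}{\sqrt{\kappa(x)}}\exp\left(-\frac1\hbar\int_a^x\kappa(x')\,dx'\right) + \frac{B}{\sqrt{\kappa(x)}}\exp\left(\frac1\hbar\int_a^x\kappa(x')\,dx'\right) && x \gg a && \text{(46b)}\end{aligned}$$
A few words of caution. If $B \neq 0$ then in the $x \gg a$ region we might be tempted to neglect the $A$ term. But this will give a bad error in the $x \ll a$ region.

Conversely if $|B| \ll |A|$, then we have to be careful about neglecting it in the $x \gg a$ region because it can become dominant for large $x$.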




Figure 2: Turning points. In (a) the classically allowed region is on the left and the classically 
forbidden region is on the right. In (b) the classically allowed region is on the right and the 
classically forbidden region is on the left. 


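Allowed on the right, forbidden on the left Suppose the turning point is at $x = b$, as depicted in Fig. 2(b). Now the connection formula is
$$\begin{aligned}\psi(x) &\approx \frac{A}{\sqrt{\kappa(x)}}\exp\left(-\frac1\hbar\int_x^b\kappa(x')\,dx'\right) + \frac{B}{\sqrt{\kappa(x)}}\exp\left(\frac1\hbar\int_x^b\kappa(x')\,dx'\right) && x \ll b && \text{(47a)}\\ \psi(x) &\approx \frac{2A}{\sqrt{p(x)}}\sin\left(\frac1\hbar\int_b^x p(x')\,dx' + \frac\pi4\right) + \frac{B}{\sqrt{p(x)}}\cos\left(\frac1\hbar\int_b^x p(x')\,dx' + \frac\pi4\right) && x \gg b && \text{(47b)}\end{aligned}$$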


Application to harmonic oscillator Applying these to the harmonic oscillator predicts that (calculation omitted - see Griffiths)
$$\frac1\hbar\int_{-x_0}^{x_0}p(x)\,dx = (n - 1/2)\pi,$$
where $\pm x_0$ are the turning points. Here we take $n = 1, 2, 3, \ldots$, which is why it matches what we observed above (where $n = 0, 1, 2, \ldots$ is from the conventional way to label the harmonic oscillator energy levels). To summarize, the Bohr-Sommerfeld quantization condition for $\frac1\hbar\int_a^b p(x)\,dx$ (where $[a, b]$ is the classically allowed region) is
$$\begin{array}{l|l}\text{two hard walls} & n\pi\\ \text{one hard wall, one soft wall} & (n - 1/4)\pi\\ \text{two soft walls} & (n - 1/2)\pi\end{array}$$
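The “one hard wall, one soft wall” rule can be tried on a linear potential $V(x) = Fx$ for $x > 0$ with a hard wall at $x = 0$. This example is an assumption chosen for illustration and is not worked in the notes. The phase integral is elementary, and the exact levels are $-a_n(\hbar^2F^2/2m)^{1/3}$ with $a_n$ the zeros of $\mathrm{Ai}$, whose numerical values below are quoted from standard tables. A minimal sketch in units $\hbar = m = F = 1$:

```python
import math

def wkb_bouncer_energy(n):
    """Solve (1/hbar) * int_0^{E/F} sqrt(2m(E - F x)) dx = (n - 1/4) * pi for E.

    In units hbar = m = F = 1 the integral equals (2/3) * sqrt(2) * E**1.5."""
    rhs = (n - 0.25) * math.pi
    return (3.0 * rhs / (2.0 * math.sqrt(2.0))) ** (2.0 / 3.0)

# First three zeros of Ai(z), quoted from standard tables.
AIRY_ZEROS = [-2.33810741, -4.08794944, -5.52055983]
exact = [-a_n * 0.5 ** (1.0 / 3.0) for a_n in AIRY_ZEROS]
wkb = [wkb_bouncer_energy(n) for n in (1, 2, 3)]
for n, (w, e) in enumerate(zip(wkb, exact), start=1):
    print(n, w, e, abs(w - e) / e)
```

The WKB levels should land within about one percent of the exact ones even for $n = 1$, with the agreement improving as $n$ grows.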


3.5 Tunneling 


We can also use WKB to estimate the rate at which a particle will “tunnel” through a classically forbidden region, as depicted in Fig. 3(a). This is useful for modeling phenomena such as radioactive decay.

The transmission probability is
$$T \approx \exp\left(-\frac2\hbar\int\kappa(x)\,dx\right),$$
where the integral runs over the classically forbidden region. This approximation is valid if the barrier is broad and high.
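For a rectangular barrier the exact transmission probability is known in closed form, so it offers a simple test of the WKB exponent. The comparison below is an illustrative sketch: the rectangular barrier, the units $\hbar = m = 1$, and the function names are assumptions that do not appear in the notes.

```python
import math

def exact_transmission(E, V0, width, m=1.0, hbar=1.0):
    """Exact transmission through a rectangular barrier of height V0 > E."""
    kappa = math.sqrt(2.0 * m * (V0 - E)) / hbar
    s = math.sinh(kappa * width)
    return 1.0 / (1.0 + V0**2 * s**2 / (4.0 * E * (V0 - E)))

def wkb_transmission(E, V0, width, m=1.0, hbar=1.0):
    """WKB estimate exp(-(2/hbar) * integral of kappa) for the same barrier."""
    kappa = math.sqrt(2.0 * m * (V0 - E))  # units of momentum, as in the notes
    return math.exp(-2.0 * kappa * width / hbar)

E, V0 = 1.0, 2.0
for width in (1.0, 3.0, 6.0):
    t_exact = exact_transmission(E, V0, width)
    t_wkb = wkb_transmission(E, V0, width)
    print(width, t_exact, t_wkb, t_exact / t_wkb)
```

For wide barriers the ratio of the two tends to the constant $16E(V_0 - E)/V_0^2$, so WKB captures the exponential dependence but not the prefactor.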




Figure 3: (a) Tunneling through a classically forbidden region. (b) A particle localized in the region $a < x < b$ will eventually tunnel into the region $x > c$.


How do we get a lifetime from this? Suppose that the particle is localized in the region $a < x < b$ in Fig. 3(b). Then we can approximate
$$\text{lifetime} = \frac{1}{\text{tunnel prob per unit time}} \approx \frac{1}{T\cdot\#\text{ hits per unit time}} = \frac\tau T,$$
where $\tau$ is the period of oscillation within the region $a < x < b$. This can be approximated by
$$\tau \approx 2\int_a^b\frac{dx}{v(x)} = 2\int_a^b\frac{m}{p(x)}\,dx.$$
Putting this together we can approximate the lifetime as
$$\text{lifetime} \approx 2\int_a^b\frac{m}{p(x)}\,dx\cdot\exp\left(\frac2\hbar\int_b^c\kappa(x)\,dx\right).$$


The fact that lifetime scales exponentially with barrier height explains the vast differences we see in alpha decay. The half-life with which $^{238}$U decays to $^{234}$Th is 4.5 billion years while the half-life for the decay of $^{214}$Po to $^{210}$Pb is 0.164 ms; see Fig. 4 for the full decay chain. This difference is due to the difference in barrier height relative to the energy of the ejected alpha particle.


Figure 4: Decay chain of $^{238}$U. From the Wikipedia article on Uranium-238.




8.06 Spring 2016 Lecture Notes 


2. Time-dependent approximation methods 


Aram Harrow 


Last updated: March 12, 2016 


Contents

1 Time-dependent perturbation theory
1.1 Rotating frame
1.2 Perturbation expansion
1.3 NMR
1.4 Periodic perturbations

2 Light and atoms
2.1 Incoherent light
2.2 Spontaneous emission
2.3 The photoelectric effect

3 Adiabatic evolution
3.1 The adiabatic approximation
3.2 Berry phase
3.3 Neutrino oscillations and the MSW effect
3.4 Born-Oppenheimer approximation

4 Scattering
4.1 Preliminaries
4.2 Born Approximation
4.3 Partial Waves


1 Time-dependent perturbation theory 


Perturbation theory can also be used to analyze the case when we have a large static Hamiltonian $H_0$ and a small, possibly time-dependent, perturbation $\delta H(t)$. In other words
$$H(t) = H_0 + \delta H(t). \tag{1}$$
However, the more important difference from time-independent perturbation theory is in our goals: we will seek to analyze the dynamics of the wavefunction (i.e. find $|\psi(t)\rangle$ as a function of $t$) rather than computing the spectrum of $H$. In fact, when we use a basis, we will work in the eigenbasis of $H_0$. For example, one common situation that we will analyze is that we start in an eigenstate of $H_0$, temporarily turn on a perturbation $\delta H(t)$ and then measure in the eigenbasis of $H_0$. This is a bit abstract, so here is a more concrete version of the example. $H_0$ is the natural Hamiltonian of the hydrogen atom and $\delta H(t)$ comes from electric and/or magnetic fields that we temporarily turn on. If we start in the 1s state, then what is the probability that after some time we will be in the 2p state? (Note that the very definition of the states depends on $H_0$ and not the perturbation.) Time-dependent perturbation theory will equip us to answer these questions.


1.1 Rotating frame 


We want to solve the time-dependent Schrödinger equation $i\hbar\partial_t|\psi(t)\rangle = H(t)|\psi(t)\rangle$. We will assume that the dynamics of $H_0$ are simple to compute and that the computational difficulty comes from $\delta H(t)$. At the same time, if $H_0$ is much larger than $\delta H(t)$ then most of the change in the state will come from $H_0$. In classical dynamics when an object is undergoing two different types of motion, it is often useful to perform a change of coordinates to eliminate one of them. We will do the same thing here. Define the state
$$|\tilde\psi(t)\rangle = e^{\frac{iH_0t}{\hbar}}|\psi(t)\rangle. \tag{2}$$
We say that $|\tilde\psi(t)\rangle$ is in the rotating frame or alternatively the interaction picture. Multiplying by $e^{\frac{iH_0t}{\hbar}}$ cancels out the natural time evolution of $H_0$. In particular, if $\delta H(t) = 0$ then we would have $|\tilde\psi(t)\rangle = |\tilde\psi(0)\rangle = |\psi(0)\rangle$. Thus, any change in $|\tilde\psi(t)\rangle$ must come from $\delta H(t)$.


Aside: comparison to Schrödinger and Heisenberg pictures. In 8.05 we saw the Schrödinger picture and the Heisenberg picture. In the former, states evolve according to $H$ and operators remain the same; in the latter, states stay the same and operators evolve according to $H$. The interaction picture can be thought of as intermediate between these two. We pick a frame rotating with $H_0$, which means that the operators evolve according to $H_0$ and the states evolve with the remaining piece of the Hamiltonian, namely $\delta H$. As we will see below, to calculate this evolution correctly we need $\delta H$ to rotate with $H_0$, just like all other operators. This is a little vague but below we will perform an exact calculation to demonstrate what happens.
Now let’s compute the time evolution of $|\tilde\psi(t)\rangle$.


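$$\begin{aligned}i\hbar\frac{d}{dt}|\tilde\psi(t)\rangle &= i\hbar\frac{d}{dt}\left(e^{\frac{iH_0t}{\hbar}}|\psi(t)\rangle\right)\\ &= -H_0e^{\frac{iH_0t}{\hbar}}|\psi(t)\rangle + e^{\frac{iH_0t}{\hbar}}(H_0 + \delta H(t))|\psi(t)\rangle\\ &= e^{\frac{iH_0t}{\hbar}}\delta H(t)|\psi(t)\rangle \qquad\text{since } H_0 \text{ and } e^{\frac{iH_0t}{\hbar}} \text{ commute}\\ &= \underbrace{e^{\frac{iH_0t}{\hbar}}\delta H(t)e^{-\frac{iH_0t}{\hbar}}}_{\widetilde{\delta H}(t)}|\tilde\psi(t)\rangle\end{aligned}$$
Thus we obtain an effective Schrödinger equation in the rotating frame
$$i\hbar\frac{d}{dt}|\tilde\psi(t)\rangle = \widetilde{\delta H}(t)|\tilde\psi(t)\rangle, \tag{3}$$
where we have defined
$$\widetilde{\delta H}(t) = e^{\frac{iH_0t}{\hbar}}\delta H(t)e^{-\frac{iH_0t}{\hbar}}.$$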


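This has a simple interpretation as a matrix. Suppose that the eigenvalues and eigenvectors of $H_0$ (reminder: we work with the eigenbasis of $H_0$ and not $H(t)$) are given by
$$H_0|n\rangle = E_n|n\rangle.$$
Define $\delta H_{mn}(t) = \langle m|\delta H(t)|n\rangle$. Then
$$\widetilde{\delta H}_{mn}(t) = \langle m|e^{\frac{iH_0t}{\hbar}}\delta H(t)e^{-\frac{iH_0t}{\hbar}}|n\rangle = e^{\frac{i(E_m - E_n)t}{\hbar}}\delta H_{mn}(t) = e^{i\omega_{mn}t}\delta H_{mn}(t),$$
where we have defined $\omega_{mn} = \frac{E_m - E_n}{\hbar}$. If we define $c_n(t)$ according to
$$|\tilde\psi(t)\rangle = \sum_n c_n(t)|n\rangle \quad\Longrightarrow\quad |\psi(t)\rangle = \sum_n e^{-\frac{iE_nt}{\hbar}}c_n(t)|n\rangle$$
then we obtain the following coupled differential equations for the $\{c_n\}$.
$$i\hbar\,\dot c_m(t) = \sum_n\widetilde{\delta H}_{mn}(t)c_n(t) = \sum_n e^{i\omega_{mn}t}\delta H_{mn}(t)c_n(t).$$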


1.2 Perturbation expansion 


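So far everything has been exact, although sometimes this is already enough to solve interesting problems. But often we will need approximate solutions. So assume that $\delta H(t) = O(\lambda)$ and expand the wavefunction in powers of $\lambda$, i.e.
$$\begin{aligned}c_n(t) &= \underbrace{c_n^{(0)}(t)}_{O(1)} + \underbrace{c_n^{(1)}(t)}_{O(\lambda)} + \underbrace{c_n^{(2)}(t)}_{O(\lambda^2)} + \ldots\\ |\tilde\psi(t)\rangle &= |\tilde\psi^{(0)}(t)\rangle + |\tilde\psi^{(1)}(t)\rangle + |\tilde\psi^{(2)}(t)\rangle + \ldots\end{aligned}$$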


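We can solve these order by order. Applying (3) we obtain
$$\underbrace{i\hbar\partial_t|\tilde\psi^{(0)}(t)\rangle}_{O(1)} + \underbrace{i\hbar\partial_t|\tilde\psi^{(1)}(t)\rangle}_{O(\lambda)} + \underbrace{i\hbar\partial_t|\tilde\psi^{(2)}(t)\rangle}_{O(\lambda^2)} + \ldots = \underbrace{\widetilde{\delta H}(t)|\tilde\psi^{(0)}(t)\rangle}_{O(\lambda)} + \underbrace{\widetilde{\delta H}(t)|\tilde\psi^{(1)}(t)\rangle}_{O(\lambda^2)} + \ldots \tag{4}$$
The solution is much simpler than in the time-independent case. There is no zeroth order term on the RHS, so the zeroth order approximation is simply that nothing happens:
$$|\tilde\psi^{(0)}(t)\rangle = |\tilde\psi^{(0)}(0)\rangle = |\psi(0)\rangle. \tag{5}$$
The first-order terms yield
$$i\hbar\partial_t|\tilde\psi^{(1)}(t)\rangle = \widetilde{\delta H}(t)|\tilde\psi^{(0)}(t)\rangle = \widetilde{\delta H}(t)|\psi(0)\rangle. \tag{6}$$
Integrating, we find
$$|\tilde\psi^{(1)}(t)\rangle = \int_0^t dt'\,\frac{\widetilde{\delta H}(t')}{i\hbar}|\psi(0)\rangle. \tag{7}$$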


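This leads to one very useful formula. If we start in state $|n\rangle$, turn on $\delta H(t)$ for times $0 \le t \le T$ and then measure in the energy eigenbasis of $H_0$, then we find that the probability of ending in state $|m\rangle$ (with $m \neq n$) is
$$P_{n\to m} = \left|\int_0^T dt'\,\frac{\delta H_{mn}(t')e^{i\omega_{mn}t'}}{i\hbar}\right|^2.$$
We can also continue to higher orders. The second-order solution is
$$|\tilde\psi^{(2)}(t)\rangle = \int_0^t dt'\int_0^{t'}dt''\,\frac{\widetilde{\delta H}(t')\widetilde{\delta H}(t'')}{(i\hbar)^2}|\psi(0)\rangle. \tag{8}$$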


1.3 NMR 


In some cases the rotating frame already helps us solve nontrivial problems exactly without going to perturbation theory. Suppose we have a single spin-1/2 particle in a magnetic field pointing in the $\hat z$ direction. This field corresponds to a Hamiltonian
$$H_0 = \omega_0S_z = \frac\hbar2\omega_0\sigma_z.$$
If the particle is a proton (i.e. hydrogen nucleus) and the field is typical for NMR, then $\omega_0$ might be around 500 MHz.


Static magnetic field Now let’s add a perturbation consisting of a magnetic field in the $\hat x$ direction. First we will consider the static perturbation
$$\delta H(t) = \Omega S_x,$$
where we will assume $\Omega \ll \omega_0$, e.g. $\Omega$ might be on the order of 20 kHz. (Why are we considering a time-independent Hamiltonian with time-dependent perturbation theory? Because really it is the time-dependence of the state and not the Hamiltonian that we are after.)

We can solve this problem exactly without using any tools of perturbation theory, but it will be instructive to compare the exact answer with the approximate one. The exact evolution is given by precession about the
$$\frac{\omega_0\hat z + \Omega\hat x}{\sqrt{\omega_0^2 + \Omega^2}}$$
axis at an angular frequency of $\sqrt{\omega_0^2 + \Omega^2}$. If $\Omega \ll \omega_0$ then this is very close to precession around the $\hat z$ axis.
Now let’s look at this problem using first-order perturbation theory.


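$$\begin{aligned}\widetilde{\delta H}(t) &= e^{\frac{i\omega_0t}{2}\sigma_z}\,\Omega S_x\,e^{-\frac{i\omega_0t}{2}\sigma_z} = \Omega\left(\cos(\omega_0t)S_x - \sin(\omega_0t)S_y\right)\\ |\tilde\psi^{(1)}(t)\rangle &= \int_0^t dt'\,\frac{\widetilde{\delta H}(t')}{i\hbar}|\psi(0)\rangle\\ &= \frac{\Omega}{i\hbar}\int_0^t dt'\left(\cos(\omega_0t')S_x - \sin(\omega_0t')S_y\right)|\psi(0)\rangle\\ &= \frac{\Omega}{i\hbar\,\omega_0}\left(\sin(\omega_0t)S_x + (\cos(\omega_0t) - 1)S_y\right)|\psi(0)\rangle\end{aligned}$$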

We see that the total change is proportional to $\Omega/\omega_0$, which is $\ll 1$. Since this is the deviation from pure rotation around the $\hat z$ axis, this is consistent with the exact answer we obtained.

The result of this calculation is that if we have a strong $\hat z$ field, then adding a weak static $\hat x$ field doesn’t do very much. If we want to have a significant effect on the state, we will need to do something else. The rotating-frame picture suggests the answer: the perturbation should rotate along with the frame, so that in the rotating frame it appears to be static.


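Rotating magnetic field Now suppose we apply the perturbation
$$\delta H(t) = \Omega\left(\cos(\omega_0t)S_x + \sin(\omega_0t)S_y\right).$$
We have already computed $\tilde S_x$ above. In the rotating frame we have
$$\begin{aligned}\tilde S_x &= \cos(\omega_0t)S_x - \sin(\omega_0t)S_y\\ \tilde S_y &= \cos(\omega_0t)S_y + \sin(\omega_0t)S_x\end{aligned}$$
Thus
$$\widetilde{\delta H}(t) = \Omega S_x.$$
The rotating-frame solution is now very simple:
$$|\tilde\psi(t)\rangle = e^{-\frac{i\Omega t}{2}\sigma_x}|\psi(0)\rangle.$$
This can be easily translated back into the stationary frame to obtain
$$|\psi(t)\rangle = e^{-\frac{i\omega_0t}{2}\sigma_z}e^{-\frac{i\Omega t}{2}\sigma_x}|\psi(0)\rangle.$$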
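The contrast between the static and rotating fields can be checked by integrating the rotating-frame amplitude equations directly. For a spin starting in $|\uparrow\rangle$ they read $\dot c_\uparrow = -\frac{i\Omega}{2}g(t)c_\downarrow$ and $\dot c_\downarrow = -\frac{i\Omega}{2}g^*(t)c_\uparrow$, with $g(t) = e^{i\omega_0t}$ for the static $\hat x$ field and $g(t) = 1$ for the rotating field. The sketch below uses a hand-written fourth-order Runge-Kutta step; the parameter values and function name are illustrative assumptions, and the comparison value for the static case is the standard Rabi formula.

```python
import cmath
import math

def flip_probability(omega0, Omega, t_final, rotating, steps=20000):
    """Integrate the two rotating-frame amplitude equations with RK4.

    Returns the probability of finding spin down after starting spin up."""
    def deriv(t, c_up, c_dn):
        g = complex(1.0) if rotating else cmath.exp(1j * omega0 * t)
        return (-0.5j * Omega * g * c_dn, -0.5j * Omega * g.conjugate() * c_up)

    c_up, c_dn = complex(1.0), complex(0.0)
    dt = t_final / steps
    for i in range(steps):
        t = i * dt
        k1 = deriv(t, c_up, c_dn)
        k2 = deriv(t + dt / 2, c_up + dt / 2 * k1[0], c_dn + dt / 2 * k1[1])
        k3 = deriv(t + dt / 2, c_up + dt / 2 * k2[0], c_dn + dt / 2 * k2[1])
        k4 = deriv(t + dt, c_up + dt * k3[0], c_dn + dt * k3[1])
        c_up += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c_dn += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return abs(c_dn) ** 2

omega0, Omega = 50.0, 1.0
t_pi = math.pi / Omega  # duration of a "pi pulse" for the rotating field
p_rotating = flip_probability(omega0, Omega, t_pi, rotating=True)
p_static = flip_probability(omega0, Omega, t_pi, rotating=False)
W = math.sqrt(omega0**2 + Omega**2)
p_static_exact = (Omega / W) ** 2 * math.sin(W * t_pi / 2) ** 2
print(p_rotating, p_static, p_static_exact)
```

The rotating field flips the spin completely after time $\pi/\Omega$, while the static field never moves more than a fraction $\Omega^2/(\omega_0^2 + \Omega^2)$ of the population.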


1.4 Periodic perturbations 


The NMR example suggests that transitions between eigenstates of $H_0$ happen most effectively when the perturbation rotates at the frequency $\omega_{mn}$. We will show that this holds more generally in first-order perturbation theory. Suppose that
$$\delta H(t) = V\cos(\omega t),$$


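for some time-independent operator $V$. If our system starts in state $|n\rangle$ then at time $t$ we can calculate, for $m \neq n$,
$$\begin{aligned}c_m(t) = \langle m|\tilde\psi^{(1)}(t)\rangle &= \int_0^t dt'\,\frac{\widetilde{\delta H}_{mn}(t')}{i\hbar}\\ &= \int_0^t dt'\,\frac{\delta H_{mn}(t')}{i\hbar}e^{i\omega_{mn}t'}\\ &= \int_0^t dt'\,\frac{V_{mn}}{i\hbar}\cos(\omega t')e^{i\omega_{mn}t'}\\ &= \frac{V_{mn}}{2i\hbar}\int_0^t dt'\left(e^{i(\omega_{mn}+\omega)t'} + e^{i(\omega_{mn}-\omega)t'}\right)\\ &= -\frac{V_{mn}}{2\hbar}\left[\frac{e^{i(\omega_{mn}+\omega)t} - 1}{\omega_{mn}+\omega} + \frac{e^{i(\omega_{mn}-\omega)t} - 1}{\omega_{mn}-\omega}\right]\end{aligned}$$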


The $\omega_{mn} \pm \omega$ terms in the denominator mean that we will get the largest contribution when $\omega \approx |\omega_{mn}|$. (A word about signs. By convention we have $\omega > 0$, but $\omega_{mn}$ is a difference of energies and so can have either sign.) For concreteness, let’s suppose that $\omega \approx \omega_{mn}$; the $\omega \approx -\omega_{mn}$ case is similar. Then we have
$$c_m(t) \approx -\frac{V_{mn}}{2\hbar}\,\frac{e^{i(\omega_{mn}-\omega)t} - 1}{\omega_{mn}-\omega}.$$


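If we now measure, then the probability of obtaining outcome $m$ is
$$P_{n\to m}(t) = |c_m(t)|^2 \approx \frac{|V_{mn}|^2}{\hbar^2}\,\frac{\sin^2\left(\frac{(\omega_{mn}-\omega)t}{2}\right)}{(\omega_{mn}-\omega)^2} = \frac{|V_{mn}|^2}{\hbar^2}\,\frac{\sin^2\left(\frac{\alpha t}{2}\right)}{\alpha^2},$$
where we have defined the detuning $\alpha = \omega_{mn} - \omega$. The $t, \alpha$ dependence is rather subtle, so we examine it separately. Define
$$f(t,\alpha) = \frac{\sin^2\left(\frac{t\alpha}{2}\right)}{\alpha^2}. \tag{9}$$
For fixed $\alpha$, $f(t,\alpha)$ is periodic in $t$.
It is more interesting to consider the case of fixed $t$, for which $f$ has a sinc-like appearance (see Fig. 1).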


Figure 1: The function $f(t,\alpha)$ from (9), representing the amount of amplitude transferred at a fixed time $t$ as we vary the detuning $\alpha = \omega_{mn} - \omega$. The plot, titled “transition amplitude for $t = 1$”, shows $f(t,\alpha) = \sin^2(t\alpha/2)/\alpha^2$ against $\alpha$.


It has zeros at $\alpha = \frac{2\pi n}{t}$ for integers $n \neq 0$. Since the closest zeros to the origin are at $\pm2\pi/t$, we call the region $\alpha \in [-2\pi/t, 2\pi/t]$ the “peak” and the rest of the real line (i.e. $|\alpha| \ge \frac{2\pi}{t}$) the “tails.” For $\alpha \to \infty$, $f(t,\alpha) \le 1/\alpha^2$. Thus, the tails have total area bounded by $2\int_{2\pi/t}^\infty\frac{d\alpha}{\alpha^2} = O(t)$.

For the peak, as $\alpha \to 0$, $f(t,\alpha) \to t^2/4$. On the other hand, $\sin$ is concave, so for $0 \le \theta \le \pi/2$ we have $\sin(\theta) \ge \frac{\theta}{\pi/2} = \frac{2\theta}{\pi}$. Thus for $|\alpha| \le \frac\pi t$ we have $f(t,\alpha) \ge \frac{t^2}{\pi^2}$. While these crude bounds do not determine the precise multiplicative constants, this does show that there is a region of width $\sim 1/t$ and height $\sim t^2$, and so the peak also has area $O(t)$.

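We conclude that $\int_{-\infty}^\infty d\alpha\,f(t,\alpha) \sim t$. Dividing by $t$, we obtain
$$\int_{-\infty}^\infty d\alpha\,\frac{f(t,\alpha)}{t} \sim 1.$$
On the other hand, $\frac{f(t,\alpha)}{t} \to 0$ as $t \to \infty$ for all $\alpha \neq 0$. So as $t \to \infty$ we see that $\frac{f(t,\alpha)}{t}$ is always nonnegative, always has total mass roughly independent of $t$, but approaches zero for all nonzero $\alpha$. This means that it approaches a delta function. A more detailed calculation (omitted, but it uses complex analysis) shows that
$$\lim_{t\to\infty}\frac{f(t,\alpha)}{t} = \frac\pi2\delta(\alpha).$$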

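The reason to divide by $t$ is that this identifies the rate of transitions per unit time. Define $R_{n\to m} = \frac{P_{n\to m}}{t}$. Then the above arguments imply that
$$R_{n\to m} \approx \frac\pi2\frac{|V_{mn}|^2}{\hbar^2}\delta(|\omega_{mn}| - \omega) \qquad\text{for large } t. \tag{10}$$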


Linewidth In practice the frequency dependence is not a delta function. The term “linewidth” 
refers to the width of the region of w that drives a transition; more concretely, FWHM stands 
for “full-width half-maximum” and denotes the width of the region that achieves > 1/2 the peak 
transition rate. The above derivation already suggests some reasons for nonzero linewidth. 


1. Finite lifetime. If we apply the perturbation for a limited amount of time, or if the state we 
are driving to/from has finite lifetime, then this will contribute linewidth on the order of 1/t. 


2. Power broadening. If $|V_{mn}|$ is large, then we will still see transitions for larger values of $|\alpha|$.
For this to prevent us from seeing the precise location of a peak, we need also the phenomenon 
of saturation in which transition rates all look the same above some threshold. (For example, 
we might observe the fraction of a beam that is absorbed by some sample, and by definition 
this cannot go above 1.) 


There are many other sources of linewidth. In general we can think of both the driving frequency $\omega$ and the gap frequency $\omega_{mn}$ as being distributions rather than $\delta$ functions. The driving frequency might come from a thermal distribution or a laser, both of which output a distribution of frequencies. The linewidth of a laser is much lower but still nonzero. The energy difference $\hbar\omega_{mn}$, which seems like a universal constant, can also be replaced by a distribution by phenomena such as Doppler broadening, in which the thermal motion of an atom will redshift or blueshift the incident light. This is just one example of a more general phenomenon in which interactions with other degrees of freedom can add to the linewidth; e.g. consider the hyperfine splitting, which measures the small shifts in an electron’s energy from its interaction with the nuclear spin. This can be thought of as adding to linewidth in two different, roughly equivalent, ways. We might think of the nuclear spins as random and thus the interaction adds a random term to the electron’s Hamiltonian. Alternatively, we might view the interaction with the nuclear spin as a source of decoherence and thus as contributing to the finite lifetime of the electron’s excited state. We will not explore those issues further here.

The other contribution to the rate is the matrix element $|V_{mn}|$. This depends not only on the strength of the perturbation, but also expresses the important point that we only see transitions from $n \to m$ if $V_{mn} \neq 0$. This is called a selection rule. In Griffiths it is proved that transitions from electric fields (see the next section) from Hydrogen state $|n,l,m\rangle$ to $|n',l',m'\rangle$ are only possible when $|l - l'| = 1$ and $|m - m'| \le 1$ (among other restrictions). Technically these constraints hold only for first-order perturbation theory, but still selection rules are important, since they tell us when we need to go to higher-order perturbation theory to see transitions (known as “forbidden transitions”). In those cases transition rates are much lower. One dramatic example is that the $2p \to 1s$ transition in hydrogen takes 1.6 ns because it occurs at first order while the $2s \to 1s$ transition takes 0.12 seconds. For this reason states such as the 2s states are called “metastable.”

We now consider the most important special case, which gets its own top-level section, despite 
being an example of a periodic perturbation, which itself is an example of first-order perturbation 
theory. 


2 Light and atoms 


Light consists of oscillating $\vec E$ and $\vec B$ fields. The effects of the $\vec B$ fields are weaker by a factor $O(v/c) \sim \alpha$, so we will focus on the $\vec E$ fields. Let
$$\vec E(\vec x) = E_0\hat z\cos(\omega t - kx).$$
However, optical wavelengths are 4000-8000 Å, while the Bohr radius is $\approx 0.5$ Å, so to leading order we can neglect the $x$ dependence. Thus we approximate
$$\delta H(t) = eE_0z\cos(\omega t). \tag{11}$$


We now can apply the results on transition rates from the last section with $V_{mn} = eE_0\langle m|z|n\rangle$. (This term is responsible for selection rules and for the role of polarization.) Thus the rate of transitions is
$$R_{n\to m} = \frac\pi2\frac{e^2E_0^2}{\hbar^2}|\langle m|z|n\rangle|^2\delta(|\omega_{mn}| - \omega). \tag{12}$$
We get contributions at $\omega_{mn} = \pm\omega$ corresponding to both absorption and stimulated emission.


Aside: quantizing light What about spontaneous emission? This does not appear in the 
semiclassical treatment we’ve described here. Nor do the photons. “Absorption” means jumping 
from a low-energy state to a higher-energy state, and “stimulated emission” means jumping from 
high energy to low energy. In the former case, we reason from energy conservation that a photon 
must have been absorbed, and in the latter, a photon must have been emitted. However, these 
arguments are rather indirect. A much more direct explanation of what happens to the photon 
comes from a more fully quantum treatment. This also yields the phenomenon of spontaneous 
emission. Recall from 8.05 that oscillating electromagnetic fields can be quantized as follows: 


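$$E_0 = \mathcal{E}_0(\hat a + \hat a^\dagger), \qquad \mathcal{E}_0 = \sqrt{\frac{4\pi\hbar\omega}{V}} \text{ (Gaussian units)} = \sqrt{\frac{\hbar\omega}{\epsilon_0V}} \text{ (SI units)}.$$
Using $\delta H = eE_0z$, we obtain
$$\delta H = e\mathcal{E}_0\,z\otimes(\hat a + \hat a^\dagger).$$
If we look at the action of $z$ in the $\{1s, 2p_z\}$ basis, then it has the form $\begin{pmatrix}0 & \alpha\\ \alpha^* & 0\end{pmatrix}$ with $\alpha = \langle 1s|z|2p_z\rangle$. We then obtain the form of the Hamiltonian examined on pset 3.

This perspective also can be used to give a partial derivation of the Lamb shift, which can be thought of as the interaction of the electron with fluctuations in the electric field of the vacuum. In the vacuum (i.e. ground state of the photon field) we have $\langle E_0\rangle \sim \langle\hat a + \hat a^\dagger\rangle = 0$ but $\langle E_0^2\rangle \sim \langle(\hat a + \hat a^\dagger)^2\rangle > 0$. These vacuum fluctuations lead to a small separation in energy between the 2s and 2p levels of hydrogen.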


Dipole moment In the Stark effect we looked at the interaction of the hydrogen atom with an electric field. This was a special case of the interaction between an $\vec E$ field and the dipole moment of a collection of particles. Here we discuss the more general case.

Suppose that we have charges $q_1, \ldots, q_N$ at positions $\vec x^{(1)}, \ldots, \vec x^{(N)}$, and we apply an electric field $\vec E(\vec x)$. The energy is determined by the scalar potential $\phi(\vec x)$ which is related to the electric field by $\vec E = -\vec\nabla\phi$. If $\vec E(\vec x) = \vec E$ (i.e. independent of position $\vec x$) then one possible solution is $\phi(\vec x) = -\vec x\cdot\vec E$. In this case the Hamiltonian will be
$$H = \sum_{i=1}^N q_i\phi(\vec x^{(i)}) = -\sum_{i=1}^N q_i\vec x^{(i)}\cdot\vec E = -\vec d\cdot\vec E,$$
where we have defined the dipole moment $\vec d = \sum_{i=1}^N q_i\vec x^{(i)}$. Our choice of $\phi$ was not unique, and we could have chosen $\phi(\vec x) = C - \vec x\cdot\vec E$ for any constant $C$. However, this would only have added an overall constant to the Hamiltonian, which would have no physical effect.

What if the electric field is spatially varying? If this spatial variation is small and we are near the origin, we use the first few terms of the Taylor expansion to approximate the field:
$$\vec E(\vec x) = \vec E(0) + \sum_{i,j=1}^3 \frac{\partial E_i}{\partial x_j}\,\hat e_i\, x_j + \ldots$$
This corresponds to a scalar potential of the form
$$\phi(\vec x) = -\sum_{i=1}^3 x_i E_i(0) - \frac{1}{2}\sum_{i,j=1}^3 \frac{\partial E_i}{\partial x_j}\, x_i x_j + \ldots$$
For the quadratic terms we see that the field couples not to the dipole moment, but to the quadrupole moment, defined to be $\sum_{i=1}^N q_i\,\vec x^{(i)}\otimes\vec x^{(i)}$. This is related to emission lines such as $1s \leftrightarrow 3d$ in which $\ell$ may change by up to $\pm 2$. Of course higher moments such as octupole moments can also be considered. We will not explore these topics further in 8.06.


2.1 Incoherent light 


While we have so far discussed monochromatic light with a definite polarization, it is easier to 
produce light with a wide range of frequencies and with random polarization. To analyze the rate 
of transitions this causes we will average (12) over frequencies and polarizations. 

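Begin with polarizations. Instead of the field being $E_0\hat z$, let the electric field be $E_0\hat P$ for some random unit vector $\hat P$. We then replace $V$ with $-E_0\hat P\cdot\vec d$. The only operator here is the dipole moment $\vec d = (d_1, d_2, d_3)$, so the matrix elements of $V$ are given by
$$V_{mn} = -E_0\sum_{i=1}^3 P_i\langle m|d_i|n\rangle.$$
Since the transition rate depends on $|V_{mn}|^2$, we will average this quantity over the choice of polarization. Denote the average over all unit vectors $\hat P$ by $\langle\cdot\rangle_{\hat P}$. Then
$$\begin{aligned}
\langle |V_{mn}|^2\rangle_{\hat P} &= E_0^2\,\langle |\hat P\cdot\vec d_{mn}|^2\rangle_{\hat P} \\
&= E_0^2\sum_{i,j=1}^3 \big\langle \langle m|P_i d_i|n\rangle\langle n|P_j d_j|m\rangle \big\rangle_{\hat P} \\
&= E_0^2\sum_{i,j=1}^3 \langle P_i P_j\rangle_{\hat P}\,\langle m|d_i|n\rangle\langle n|d_j|m\rangle \\
&= E_0^2\sum_{i,j=1}^3 \frac{\delta_{ij}}{3}\,\langle m|d_i|n\rangle\langle n|d_j|m\rangle \qquad\text{(explained below)} \\
&= \frac{E_0^2}{3}\,|\vec d_{mn}|^2,
\end{aligned}$$
where $|\vec d_{mn}|^2 = \sum_{i=1}^3 |\langle m|d_i|n\rangle|^2$.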
ij=l 


How did we calculate $\langle P_iP_j\rangle$? This can be done by explicit calculation, but it is easier to use symmetry. First, observe that the uniform distribution over unit vectors is invariant under the reflection $P_i \mapsto -P_i$. Thus, if $i\neq j$, then $\langle P_iP_j\rangle = \langle (-P_i)P_j\rangle = 0$. On the other hand, rotation symmetry means that $\langle P_i^2\rangle$ should be independent of $i$. Since $P_1^2+P_2^2+P_3^2 = 1$, we also have $\langle P_1^2+P_2^2+P_3^2\rangle = 1$ and thus $\langle P_i^2\rangle = 1/3$. Putting this together we obtain
$$\langle P_iP_j\rangle_{\hat P} = \frac{\delta_{ij}}{3}. \tag{13}$$
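As a quick numerical sanity check of (13), the following sketch averages $P_iP_j$ over the unit sphere with a simple midpoint rule in the polar and azimuthal angles. The function name `average_PiPj` and the grid sizes are illustrative choices and not part of the notes.

```python
import math

def average_PiPj(n_theta=200, n_phi=400):
    """Midpoint-rule average of P_i P_j over the unit sphere (3x3 nested list)."""
    avg = [[0.0] * 3 for _ in range(3)]
    total_weight = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * math.pi / n_theta
        w = math.sin(theta)  # area element is sin(theta) dtheta dphi
        for b in range(n_phi):
            phi = (b + 0.5) * 2.0 * math.pi / n_phi
            P = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            total_weight += w
            for i in range(3):
                for j in range(3):
                    avg[i][j] += w * P[i] * P[j]
    return [[x / total_weight for x in row] for row in avg]

if __name__ == "__main__":
    # Should be close to the identity matrix divided by 3.
    for row in average_PiPj():
        print(["%.4f" % x for x in row])
```

The diagonal entries should come out close to 1/3 and the off-diagonal entries close to 0, in line with the symmetry argument.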


Next, we would like to average over different frequencies. The energy density of an electric field is $U = \frac{E_0^2}{8\pi}$ (using Gaussian units). Define $U(\omega)$ to be the energy density at frequency $\omega$, so that $U = \int U(\omega)\,d\omega$. If we consider light with this power spectrum, then we should integrate the monochromatic rate against the distribution $U(\omega)$ to obtain
$$\begin{aligned}
R_{n\to m} &= \int d\omega\, U(\omega)\,\frac{4\pi^2}{3\hbar^2}\,|\vec d_{mn}|^2\,\delta(\omega - |\omega_{mn}|) \\
&= \frac{4\pi^2}{3\hbar^2}\,|\vec d_{mn}|^2\, U(|\omega_{mn}|).
\end{aligned}$$
This last expression is known as Fermi’s Golden Rule. (It was discovered by Dirac, but Fermi called it “Golden Rule #2”.)


2.2 Spontaneous emission 


The modern description of spontaneous emission requires QED, but the first derivation of it predates 
even modern quantum mechanics! In a simple and elegant argument, Einstein: 


(a) derived an exact relation between rates of spontaneous emission, stimulated emission and 
absorption; and 


(b) proposed the phenomenon of stimulated emission, which was not observed until 1960. 




He did this in 1917, nearly a decade before even the Schrödinger equation!

Here we will reproduce that argument. It assumes a collection of atoms that can be in either state $a$ or state $b$. Suppose that there are $N_a$ atoms in state $a$ and $N_b$ atoms in state $b$, and that the states have energies $E_a, E_b$ with $E_b > E_a$. Define $\omega_{ba} = \frac{E_b - E_a}{\hbar}$ and $\beta = 1/k_BT$. Assume further that the atoms are in contact with a bath of photons and that the entire system is in thermal equilibrium with temperature $T$. From this we can deduce three facts:

Fact 1. Equilibrium means no change: $\dot N_a = \dot N_b = 0$.

Fact 2. At thermal equilibrium we have $\frac{N_b}{N_a} = \frac{e^{-\beta E_b}}{e^{-\beta E_a}} = e^{-\beta\hbar\omega_{ba}}$.

Fact 3. At thermal equilibrium the black-body radiation spectrum is
$$U(\omega) = \frac{\hbar}{\pi^2c^3}\,\frac{\omega^3}{e^{\beta\hbar\omega}-1}. \tag{14}$$


We would like to understand the following processes:

Process                Explanation                                               Rate

Absorption             A photon of frequency $\omega_{ba}$ is absorbed and an    $B_{ab}N_aU(\omega_{ba})$
                       atom changes from state $a$ to state $b$.

Spontaneous emission   A photon of frequency $\omega_{ba}$ is emitted and an     $AN_b$
                       atom changes from state $b$ to state $a$.

Stimulated emission    A photon of frequency $\omega_{ba}$ is emitted and an     $B_{ba}N_bU(\omega_{ba})$
                       atom changes from state $b$ to state $a$.

These processes depend on the Einstein coefficients $A$, $B_{ab}$ and $B_{ba}$ for spontaneous emission, absorption and stimulated emission respectively. They also depend on the populations of atoms and/or photons that they involve; e.g. absorption requires an atom in state $a$ and a photon of frequency $\omega_{ba}$, so its rate is proportional to $N_aU(\omega_{ba})$. Here it is safe to posit the existence of stimulated emission because we have not assumed that $B_{ba}$ is nonzero.

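Having set up the problem, we are now almost done! Adding these processes up, we get
$$\dot N_b = -N_bA - N_bB_{ba}U(\omega_{ba}) + N_aB_{ab}U(\omega_{ba}). \tag{15}$$
From Fact 1, $\dot N_b = 0$ and so we can rearrange (15) to obtain
$$U(\omega_{ba}) = \frac{A}{\frac{N_a}{N_b}B_{ab} - B_{ba}} \overset{\text{Fact 2}}{=} \frac{A}{e^{\beta\hbar\omega_{ba}}B_{ab} - B_{ba}} \overset{\text{Fact 3}}{=} \frac{\hbar\omega_{ba}^3}{\pi^2c^3}\,\frac{1}{e^{\beta\hbar\omega_{ba}} - 1}.$$
Since this relation should hold for all values of $\beta$, we can equate coefficients and find
$$B_{ab} = B_{ba} \tag{16a}$$
$$A = \frac{\hbar\omega_{ba}^3}{\pi^2c^3}\,B_{ab} \tag{16b}$$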


We see that these three processes are tightly related! All from a simple thought experiment, and 
not even the one that Einstein is most famous for. 
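To see the detailed balance at work, here is a minimal numerical sketch that plugs (16a) and (16b) back into the rate equation (15), with the populations fixed by Fact 2 and the radiation field by Fact 3, and evaluates the net rate $\dot N_b$. The function names, the choice $B = 1$ in arbitrary units, and the sample frequency and temperatures are illustrative assumptions; the physical constants are standard CGS values.

```python
import math

HBAR = 1.054571817e-27   # erg s
C = 2.99792458e10        # cm / s
K_B = 1.380649e-16       # erg / K

def planck_U(omega, T):
    """Black-body energy density per unit angular frequency, eq. (14)."""
    prefactor = HBAR * omega**3 / (math.pi**2 * C**3)
    return prefactor / math.expm1(HBAR * omega / (K_B * T))

def net_rate_Nb(omega, T, B=1.0):
    """Right-hand side of (15) with B_ab = B_ba = B (16a) and A from (16b)."""
    A = HBAR * omega**3 / (math.pi**2 * C**3) * B     # (16b)
    N_a = 1.0
    N_b = N_a * math.exp(-HBAR * omega / (K_B * T))   # Fact 2
    U = planck_U(omega, T)                            # Fact 3
    return -N_b * A - N_b * B * U + N_a * B * U

if __name__ == "__main__":
    omega = 3.0e15  # rad/s, roughly optical (illustrative)
    for T in (300.0, 6000.0):
        print(T, net_rate_Nb(omega, T), planck_U(omega, T))
```

The net rate should vanish up to floating-point rounding at every temperature, which is the content of Fact 1.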

Today we can understand this as the fact that the electric field enters into the Hamiltonian as a Hermitian operator proportional to $\hat a + \hat a^\dagger$, and so the photon-destroying processes containing $\hat a$ are inevitably accompanied by photon-creating processes containing $\hat a^\dagger$. Additionally, the relation between spontaneous and stimulated emission can be seen in the fact that both involve an $\hat a^\dagger$ operator acting on the photon field. If there are no photons, then the field is in state $|0\rangle$ and we get the term $\hat a^\dagger|0\rangle = |1\rangle$, corresponding to spontaneous emission. If there are already $n$ photons in the mode, then we get the term $\hat a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle$. Since the probabilities are the squares of the amplitudes, this means that we see photons emitted at $n+1$ times the spontaneous emission rate. In Einstein’s terminology, the $n$ here is from stimulated emission and the $+1$ is from the spontaneous emission, which always occurs independent of the number of photons present.

Returning to (16), we plug in Fermi’s Golden Rule and obtain the rates
$$B_{ab} = B_{ba} = \frac{4\pi^2}{3\hbar^2}\,|\vec d_{ab}|^2 \qquad\text{and}\qquad A = \frac{4\omega_{ba}^3}{3\hbar c^3}\,|\vec d_{ab}|^2.$$


2.3 The photoelectric effect


So far we have considered transitions between individual pairs of states. Ionization (aka the photoelectric effect) involves a transition from a bound state of some atom to one in which the electron is in a free state. This presents a few new challenges. First, we are used to treating unbound states as unnormalized, e.g. $\psi_{\vec k}(\vec x) = e^{i\vec k\cdot\vec x}$. Second, to calculate the ionization rate, we need to sum over final states, since the quantity of physical interest is the total rate of electrons being dislodged from an atom, and not the rate at which they transition to any specific final state. (A more refined analysis might look at the angular distribution of the final state.)

Suppose our initial state is our old friend, the ground state of the hydrogen atom, and the final state is a plane wave. If the final state is unnormalized, then matrix elements such as $\langle\psi_{\text{final}}|V|\psi_{\text{initial}}\rangle$ become less meaningful. One way to fix this is to put the system in a box of size $L\times L\times L$ with $L \gg a_0 = \frac{\hbar^2}{me^2}$ and to impose periodic boundary conditions. The resulting plane-wave states are now
$$\psi_{\vec k}(\vec x) = \frac{\exp(i\vec k\cdot\vec x)}{L^{3/2}},$$
where $\vec k = \frac{2\pi}{L}\vec n$ and $\vec n = (n_1,n_2,n_3)$ is a vector of integers. (We use $\vec k$ instead of $\vec p = \hbar\vec k$ to keep the notation more compact.) We will assume that $L \gg a_0$ and also that the final energy of the electron is $\gg 13.6$ eV. This means that we can approximate the final state as a free electron and can ignore the interaction with the proton.

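Apply an oscillating electric field to obtain the time-dependent potential
$$\delta H = eE_0x_3\cos(\omega t) = V\cos(\omega t).$$
The rate of the transitions will be governed by the matrix element
$$\langle\vec k|V|1,0,0\rangle = \frac{eE_0}{\sqrt{\pi a_0^3L^3}}\underbrace{\int d^3x\; x_3\exp\left(-\frac{r}{a_0} - i\vec k\cdot\vec x\right)}_{A} = \frac{eE_0}{\sqrt{\pi a_0^3L^3}}\,A.$$
We have defined $A$ to equal the difficult-to-evaluate integral. The factor of $x_3$ can be removed by writing $A = i\frac{\partial B}{\partial k_3}$, where
$$\begin{aligned}
B &= \int d^3x\,\exp\left(-\frac{r}{a_0} - i\vec k\cdot\vec x\right) \\
&= 2\pi\int_0^\infty r^2dr\int_{-1}^{1}d\mu\,\exp\left(-\frac{r}{a_0} - ikr\mu\right) && \text{defining } \mu = \hat x\cdot\hat k,\ k = |\vec k|,\ r = |\vec x| \\
&= \frac{4\pi}{k}\int_0^\infty r\,dr\,\sin(kr)\,e^{-r/a_0} \\
&= \frac{4\pi}{k}\operatorname{Im}\int_0^\infty r\,dr\,e^{-r(a_0^{-1} - ik)} \\
&= \frac{4\pi}{k}\operatorname{Im}\frac{1}{(a_0^{-1} - ik)^2} \\
&= \frac{4\pi}{k}\,\frac{2k\,a_0^{-1}}{(k^2 + a_0^{-2})^2} = \frac{8\pi}{a_0(k^2 + a_0^{-2})^2}.
\end{aligned}$$
To compute $A$, we use the fact that $\frac{\partial k^2}{\partial k_3} = 2k_3$. Thus
$$A = i\frac{\partial B}{\partial k_3} = \frac{-32\pi i\,k_3}{a_0(k^2 + a_0^{-2})^3}.$$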
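The closed forms for $B$ and $A$ are easy to check numerically. After the angular integration, $B$ reduces to the radial integral $\frac{4\pi}{k}\int_0^\infty r\sin(kr)e^{-r/a_0}dr$, which the sketch below evaluates with Simpson's rule and compares against $8\pi/[a_0(k^2+a_0^{-2})^2]$. The function names, the cutoff `r_max` and the units with $a_0 = 1$ are illustrative assumptions.

```python
import math

def B_closed(k, a0=1.0):
    """Closed form B = 8 pi / (a0 (k^2 + a0^-2)^2)."""
    return 8.0 * math.pi / (a0 * (k**2 + a0**-2) ** 2)

def B_numeric(k, a0=1.0, r_max=60.0, n=200000):
    """Simpson's rule for (4 pi / k) * int_0^r_max r sin(k r) exp(-r/a0) dr (n even)."""
    h = r_max / n

    def g(r):
        return r * math.sin(k * r) * math.exp(-r / a0)

    s = g(0.0) + g(r_max)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * g(i * h)
    return 4.0 * math.pi / k * s * h / 3.0

def A_closed(k3, k, a0=1.0):
    """A = i dB/dk_3 as a complex number; k is the magnitude of the wave vector."""
    return -32j * math.pi * k3 / (a0 * (k**2 + a0**-2) ** 3)

if __name__ == "__main__":
    for k in (0.5, 2.0, 5.0):
        print(k, B_numeric(k), B_closed(k))
```

A finite-difference derivative of `B_closed` with respect to $k_3$ (holding the other components of $\vec k$ fixed) can be compared with the imaginary part of `A_closed` in the same way.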


We can simplify this expression using our assumption that the photon energy (and therefore also the final-state energy) is much larger than the binding energy. The final energy is $\frac{\hbar^2k^2}{2m}$ and the binding energy is $\frac{\hbar^2}{2ma_0^2}$. Thus
$$\frac{\hbar^2k^2}{2m} \gg \frac{\hbar^2}{2ma_0^2} \quad\Longrightarrow\quad ka_0 \gg 1.$$
We can use this to simplify $A$ by dropping the $a_0^{-2}$ term in the denominator:
$$A \approx \frac{-32\pi i\,k_3}{a_0k^6} = \frac{-32\pi i\cos(\theta)}{a_0k^5},$$
where $\theta$ is the angle between $\vec k$ and the $z$-axis.

We can now compute the squared matrix element to be (canceling a factor of $\pi$ from numerator and denominator)
$$|\langle\vec k|V|1,0,0\rangle|^2 = \frac{e^2E_0^2}{a_0^3L^3}\,\frac{1024\pi\cos^2(\theta)}{a_0^2k^{10}}.$$


To compute the average rate over some time $t$, we will multiply this by $\frac{1}{\hbar^2}\frac{f(t,\alpha)}{t}$, where $f(t,\alpha) = \frac{\sin^2(\alpha t/2)}{\alpha^2}$ and $\hbar\alpha$ is the difference in energy between the initial and final states. If we neglect the energy of the initial state, we obtain
$$R_{1,0,0\to\vec k} = \frac{1024\pi e^2E_0^2\cos^2(\theta)}{\hbar^2a_0^5k^{10}L^3}\,\frac{f\left(t, \frac{\hbar k^2}{2m} - \omega\right)}{t}.$$
We can simplify this a bit by averaging over all the polarizations of the light. (In fact, the angular dependence of the free electron can often carry useful information, but here averaging will help simplify some calculations.) The average of $\cos^2(\theta)$ over the sphere is $1/3$ (by the same arguments we used in the derivation of Fermi’s golden rule), so we obtain
$$\langle R_{1,0,0\to\vec k}\rangle = \frac{1024\pi e^2E_0^2}{3\hbar^2a_0^5k^{10}L^3}\,\frac{f\left(t, \frac{\hbar k^2}{2m} - \omega\right)}{t}.$$


Let’s pause for a minute to look at what we’ve derived. One strange feature is the $1/L^3$ term, because the rate of ionization should not depend on how much empty space surrounds the atom. Another strange thing appears to happen when we take $t$ large, so that $f(t,\alpha)/t$ will approach $\frac{\pi}{2}\delta(\alpha)$. This would cause the transition rates to be nonzero only when $2m\omega/\hbar$ exactly equals $k^2$ for some valid vector $\vec k$ (i.e. of the form $\frac{2\pi}{L}\vec n$). We do not generally expect physical systems to have such sensitive dependence on their parameters.

As often happens when two things look wrong, these difficulties can be made to “cancel each
other out.” Let us take $t$ to be large but finite. It will turn out that $t$ needs to be large only relative
to cee which is not very demanding when L is large. In this case, we can approximate f(t, a) 
with a step function: 


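$$f(t,\alpha) \approx \tilde f(t,\alpha) \equiv \begin{cases} \frac{\pi t^2}{2} & \text{if } 0 \le \alpha \le \frac{1}{t} \\ 0 & \text{otherwise} \end{cases}$$
In what sense is this a good approximation? We argue that for large $t$, $\tilde f(t,\alpha)/t \approx \frac{\pi}{2}\delta(\alpha)$, just like $f(t,\alpha)/t$. Suppose that $g(\alpha)$ is a function satisfying $|g'(\alpha)| \le C$ for all $\alpha$. Then
$$\begin{aligned}
\left|\int_{-\infty}^{\infty}d\alpha\left(\frac{\tilde f(t,\alpha)}{t} - \frac{\pi}{2}\delta(\alpha)\right)g(\alpha)\right| &= \frac{\pi}{2}\left|\,t\int_0^{1/t}d\alpha\,g(\alpha) - g(0)\right| \\
&= \frac{\pi}{2}\left|\,t\int_0^{1/t}d\alpha\,\big(g(\alpha) - g(0)\big)\right| \\
&= \frac{\pi}{2}\left|\,t\int_0^{1/t}d\alpha\int_0^{\alpha}d\beta\,g'(\beta)\right| \\
&\le \frac{\pi}{2}\,t\int_0^{1/t}d\alpha\int_0^{\alpha}d\beta\,|g'(\beta)| && \text{triangle inequality} \\
&\le \frac{\pi}{2}\,t\int_0^{1/t}d\alpha\,C\alpha = \frac{\pi C}{4t}.
\end{aligned}$$
This tends to 0 as $t\to\infty$. (This is an example of a more general principle that the “shape” of a $\delta$ function doesn’t matter. For example, the limit of a Gaussian distribution with $\sigma^2\to 0$ would also work.)

Now using $\tilde f(t,\alpha)$, we get a nonzero contribution for $\vec k$ satisfying
$$0 \le \frac{\hbar k^2}{2m} - \omega \le \frac{1}{t}. \tag{17}$$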

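How many $\vec k$ satisfy (17)? Valid $\vec k$ live on a cubic lattice with spacing $2\pi/L$, and thus have density $(L/2\pi)^3$. Thus we can estimate the number of $\vec k$ satisfying (17) by $(L/2\pi)^3$ times the volume of $k$-space satisfying (17). This in turn corresponds to a spherical shell of inner radius $\sqrt{2m\omega/\hbar}$ and thickness $\frac{1}{2t\omega}\sqrt{\frac{2m\omega}{\hbar}}$. Thus we have
$$\#\{\vec k \text{ satisfying (17)}\} = \left(\frac{L}{2\pi}\right)^3 4\pi\left(\frac{2m\omega}{\hbar}\right)\frac{1}{2t\omega}\sqrt{\frac{2m\omega}{\hbar}} = \frac{L^3m}{2\pi^2\hbar t}\sqrt{\frac{2m\omega}{\hbar}} \approx \frac{L^3mk}{2\pi^2\hbar t}.$$
In the last step we use the fact that the spherical shell is thin to approximate $k \approx \sqrt{2m\omega/\hbar}$. Thus, when we sum $\tilde f(t,\alpha)/t$ over $\vec k$ we obtain
$$\sum_{\vec k}\frac{\tilde f\left(t, \frac{\hbar k^2}{2m} - \omega\right)}{t} = \frac{L^3mk}{2\pi^2\hbar t}\cdot\frac{\pi t}{2} = \frac{L^3mk}{4\pi\hbar}.$$
We have obtained our factor of $L^3$ that removes the unphysical dependence on the boundary conditions. Putting everything together we get
$$R_{1,0,0\to\text{all }\vec k} = \frac{256\,m\,e^2E_0^2}{3\hbar^3a_0^5k^9}.$$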
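The mode-counting step can be checked directly. The sketch below counts the lattice vectors $\vec k = \frac{2\pi}{L}\vec n$ that fall in the shell $0 \le \frac{\hbar k^2}{2m} - \omega \le \frac{1}{t}$ and compares the count with the thin-shell estimate $\frac{L^3mk}{2\pi^2\hbar t}$. It works in units with $\hbar = m = 1$; the function names and the sample values of $L$, $\omega$ and $t$ are illustrative assumptions.

```python
import math

def count_modes(L, omega, t):
    """Count k = 2*pi*n/L with 0 <= k^2/2 - omega <= 1/t (units hbar = m = 1)."""
    s = L / (2.0 * math.pi)                 # |n| = s * |k|
    lo = s * s * 2.0 * omega                # lower bound on |n|^2
    hi = s * s * 2.0 * (omega + 1.0 / t)    # upper bound on |n|^2
    n_max = int(math.sqrt(hi)) + 1
    count = 0
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            for n3 in range(-n_max, n_max + 1):
                if lo <= n1 * n1 + n2 * n2 + n3 * n3 <= hi:
                    count += 1
    return count

def estimate_modes(L, omega, t):
    """Thin-shell estimate L^3 m k / (2 pi^2 hbar t) with hbar = m = 1."""
    k = math.sqrt(2.0 * omega)
    return L**3 * k / (2.0 * math.pi**2 * t)

if __name__ == "__main__":
    L, omega, t = 2.0 * math.pi * 30.5, 0.5, 30.0
    print(count_modes(L, omega, t), estimate_modes(L, omega, t))
```

The two numbers should agree to within a few percent for these values, and the agreement improves as $L$ grows with $t$ held fixed, which is why the $1/L^3$ in the single-mode rate cancels.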


3 Adiabatic evolution 


3.1 The adiabatic approximation 


We now turn to a different kind of approximation, in which we consider slowly varying Hamiltonians. We will consider a time-dependent Hamiltonian $H(t)$. Let $|\psi_n(t)\rangle$ and $E_n(t)$ be the “instantaneous” eigenstates and eigenenergies, defined by
$$H(t)|\psi_n(t)\rangle = E_n(t)|\psi_n(t)\rangle, \qquad E_1(t) \le E_2(t) \le \ldots \tag{18}$$
We also define $|\Psi(t)\rangle$ to be the solution of Schrödinger’s equation: i.e.
$$i\hbar\frac{d}{dt}|\Psi(t)\rangle = H(t)|\Psi(t)\rangle. \tag{19}$$
Beware that (18) and (19) are not the same. You might think of (18) as a naive attempt to solve (19). If the system starts in $|\psi_n(0)\rangle$ at time 0, there is of course no reason in general to expect that $|\psi_n(t)\rangle$ will be the correct solution for later $t$. And yet, the adiabatic theorem states that in some cases this is exactly what happens.

Theorem 1 (Adiabatic theorem). Suppose at $t = 0$, $|\Psi(0)\rangle = |\psi_n(0)\rangle$ for some $n$. Then if $H$ is changed slowly for $0 \le t \le T$, then at time $T$ we will have $|\Psi(T)\rangle \approx |\psi_n(T)\rangle$.


This theorem is stated in somewhat vague terms, e.g. what does “changed slowly” mean? $\dot H$ should be small, but relative to what? One clue is the reference to the $n$th eigenstate $|\psi_n(t)\rangle$. This is only well defined if $E_n(t)$ is unique, so clearly the theorem fails in the case of degenerate eigenvalues. And since the theorem should not behave discontinuously with respect to $H(t)$, it should also fail for “nearly” degenerate eigenvalues. This gives us another energy scale to compare with $\dot H$ (which has units of energy/time, or energy squared once we multiply by $\hbar$). We will see later the sense in which this can be shown to be the right comparison.


Example. Suppose we have a spin-1/2 particle in a magnetic field B (t). Then the Hamiltonian 
is H(t) = gene - B(t). The adiabatic theorem says that if we start with the spin and B both 


pointing in the $+\hat z$ direction and gradually rotate $\vec B$ to point in the $\hat x$ direction, then the spin will follow the magnetic field and also point in the $\hat x$ direction. Given that the Schrödinger equation prescribes instead that the spin precess around the magnetic field, this behavior appears at first somewhat strange.


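Derivation We will not rigorously prove the adiabatic theorem, but will describe most of the derivation. Begin by writing
$$|\Psi(t)\rangle = \sum_n c_n(t)|\psi_n(t)\rangle.$$
Taking derivatives of both sides we obtain
$$i\hbar\frac{d}{dt}|\Psi(t)\rangle = i\hbar\sum_n\Big(\dot c_n(t)|\psi_n(t)\rangle + c_n(t)|\dot\psi_n(t)\rangle\Big) = \sum_n c_n(t)E_n(t)|\psi_n(t)\rangle.$$
Multiply both sides by $\langle\psi_k(t)|$ and we obtain
$$i\hbar\dot c_k = E_kc_k - i\hbar\sum_n\langle\psi_k|\dot\psi_n\rangle c_n. \tag{20}$$
Now we need a way to evaluate $\langle\psi_k|\dot\psi_n\rangle$ in terms of more familiar quantities.
$$\begin{aligned}
\text{Start with} &\qquad H|\psi_n\rangle = E_n|\psi_n\rangle \\
\text{Apply } \tfrac{d}{dt} &\qquad \dot H|\psi_n\rangle + H|\dot\psi_n\rangle = \dot E_n|\psi_n\rangle + E_n|\dot\psi_n\rangle \\
\text{Apply } \langle\psi_k| &\qquad \langle\psi_k|\dot H|\psi_n\rangle + E_k\langle\psi_k|\dot\psi_n\rangle = \dot E_n\delta_{kn} + E_n\langle\psi_k|\dot\psi_n\rangle
\end{aligned}$$
This equation has two interesting cases: $k = n$ and $k \neq n$. The former will not be helpful in estimating $\langle\psi_k|\dot\psi_n\rangle$, but does give us a useful result, called the Hellmann-Feynman theorem.
$$\begin{aligned}
k = n: &\qquad \dot E_n = \langle\psi_n|\dot H|\psi_n\rangle \\
k \neq n: &\qquad \langle\psi_k|\dot\psi_n\rangle = \frac{\langle\psi_k|\dot H|\psi_n\rangle}{E_n - E_k} \equiv \frac{\dot H_{kn}}{E_n - E_k}
\end{aligned}$$
In the last step, we used $\dot H_{kn}$ to refer to the matrix elements of $\dot H$ in the $\{|\psi_n\rangle\}$ basis. Plugging this into (20) we find
$$i\hbar\dot c_k = \underbrace{\big(E_k - i\hbar\langle\psi_k|\dot\psi_k\rangle\big)c_k}_{\text{adiabatic approximation}}\ \underbrace{-\ i\hbar\sum_{n\neq k}\frac{\dot H_{kn}}{E_n - E_k}\,c_n}_{\text{error term}} \tag{21}$$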


If the part of the equation denoted “error term” did not exist, then $|c_k|$ would be independent of time, which would confirm the adiabatic theorem. Furthermore, the error term is suppressed by a factor of $1/\Delta_{nk}$, where $\Delta_{nk} = E_n - E_k$ is the energy gap. So naively it seems that if $\dot H$ is small relative to $\Delta_{nk}$ then the error term should be small. On the other hand, these two quantities do not even have the same units, so we will have to be careful.
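The Hellmann-Feynman theorem that appeared as a by-product of this derivation, $\dot E_n = \langle\psi_n|\dot H|\psi_n\rangle$, can be tested on a small example. The sketch below uses a hypothetical real symmetric $2\times 2$ Hamiltonian $H(t) = H_0 + tV$ (so $\dot H = V$) and compares a finite-difference derivative of the ground-state energy with the expectation value of $V$. The matrices and function names are made up for illustration.

```python
import math

H0 = ((1.0, 0.3), (0.3, -1.0))   # hypothetical numbers
V = ((0.5, 0.2), (0.2, -0.4))    # H-dot for H(t) = H0 + t*V

def hamiltonian(t):
    return tuple(tuple(H0[i][j] + t * V[i][j] for j in range(2)) for i in range(2))

def ground_state(h):
    """Lowest eigenvalue and normalized eigenvector of a real symmetric 2x2 matrix."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    E = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    v = (b, E - a)   # solves (H - E) v = 0; valid when b != 0
    norm = math.hypot(v[0], v[1])
    return E, (v[0] / norm, v[1] / norm)

def hellmann_feynman(t):
    """<psi|V|psi> evaluated in the instantaneous ground state."""
    _, psi = ground_state(hamiltonian(t))
    return sum(psi[i] * V[i][j] * psi[j] for i in range(2) for j in range(2))

def dE_dt_finite_difference(t, eps=1e-5):
    E_plus, _ = ground_state(hamiltonian(t + eps))
    E_minus, _ = ground_state(hamiltonian(t - eps))
    return (E_plus - E_minus) / (2.0 * eps)

if __name__ == "__main__":
    for t in (0.0, 0.7, -1.3):
        print(t, dE_dt_finite_difference(t), hellmann_feynman(t))
```

The two columns should agree up to the finite-difference error, as long as the off-diagonal element of $H(t)$ is nonzero so that the eigenvector formula used here applies.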




Phases Before we analyze the error term, let’s look at the phases we get if the error term were not there, i.e. suppose that $i\hbar\dot c_k = (E_k - i\hbar\langle\psi_k|\dot\psi_k\rangle)c_k$. The solution of this differential equation is
$$c_k(t) = c_k(0)\,e^{i\theta_k(t)}e^{i\gamma_k(t)} \tag{22a}$$
$$\theta_k(t) = -\frac{1}{\hbar}\int_0^t E_k(t')\,dt' \qquad \gamma_k(t) = \int_0^t\nu_k(t')\,dt' \qquad \nu_k(t) = i\langle\psi_k|\dot\psi_k\rangle \tag{22b}$$
The $\theta_k(t)$ term is called the “dynamical” phase and corresponds to exactly what you’d expect from a Hamiltonian that’s always on; namely the phase of state $k$ rotates at rate $-E_k/\hbar$. The $\gamma_k(t)$ is called the “geometric phase” or “Berry phase” and will be discussed further in the next lecture. At this point, observe only that it is independent of $\hbar$ and that $\nu_k(t)$ can be seen to be real by applying $d/dt$ to the equation $\langle\psi_k|\psi_k\rangle = 1$.


Validity of the adiabatic approximation Let’s estimate the magnitude of the error term in a toy model. Suppose that $H(t) = H_0 + \frac{t}{T}V$, where $H_0, V$ are time-independent and $T$ is a constant that sets the timescale on which $V$ is turned on. Then $\dot H = V/T$. An important prediction of the adiabatic theorem is that the more slowly $H$ changes from $H_0$ to $H_0 + V$, the lower the probability of transition should be; i.e. increasing $T$ should reduce the error term, even if we integrate over time from 0 to $T$.

Let’s see how this works. If the gap is always $\ge\Delta$, then we can upper-bound the transition rate by some matrix element of $\frac{V}{T\Delta}$. This decreases as $T$ and $\Delta$ increase, which is good. But if we add up this rate of transitions over time $T$, then the total transition amplitude can be as large as $\sim V/\Delta$. Thus, going more slowly appears not to reduce the total probability of transition!

What went wrong? Well, we assumed that amplitude from state $n$ simply added up in state $k$. But if the states have different energies, then over time the terms we add will have different phases, and may cancel out. This can be understood in terms of time-dependent perturbation theory. Define $\tilde c_k(t) = e^{-i\theta_k(t)}c_k(t)$. Observe that
$$\begin{aligned}
i\hbar\frac{d}{dt}\tilde c_k(t) &= \hbar\dot\theta_k(t)\,e^{-i\theta_k(t)}c_k(t) + i\hbar\,e^{-i\theta_k(t)}\dot c_k(t) \\
&= -E_k(t)\,e^{-i\theta_k(t)}c_k(t) + e^{-i\theta_k(t)}\Big(\big(E_k(t) - \hbar\nu_k(t)\big)c_k(t) - i\hbar\sum_{n\neq k}\frac{\dot H_{kn}}{E_n - E_k}\,c_n(t)\Big) \\
&= -\hbar\nu_k(t)\,\tilde c_k(t) - i\hbar\sum_{n\neq k}\frac{\dot H_{kn}}{E_n - E_k}\,e^{i(\theta_n(t) - \theta_k(t))}\,\tilde c_n(t).
\end{aligned}$$
In the last step we have used $c_n(t) = e^{i\theta_n(t)}\tilde c_n(t)$. Let’s ignore the $\nu_k(t)$ geometric phase term (since our analysis here is somewhat heuristic). We see that the error term is the same as in (21) but with an extra phase of $e^{i(\theta_n(t) - \theta_k(t))}$. Analyzing this in general is tricky, but let’s suppose that the energy levels are roughly constant, so we can replace it with $e^{-i\omega_{nk}t}$, where $\omega_{nk} = (E_n - E_k)/\hbar$.

Now when we integrate the contribution of this term from $t = 0$ to $t = T$ we get
$$\int_0^T dt\,\frac{\dot H_{kn}}{\hbar\omega_{nk}}\,e^{-i\omega_{nk}t} \sim \frac{V}{T\hbar\omega_{nk}}\,\frac{e^{-i\omega_{nk}T} - 1}{-i\omega_{nk}} \sim \frac{V}{T\hbar\omega_{nk}^2} \sim \frac{\hbar V}{\Delta^2T}.$$
Finally we obtain that the probability of transition decreases with $T$. This can be thought of as a rough justification of the adiabatic theorem, but it of course made many simplifying assumptions and in general it will be only qualitatively correct.




This was focused on a specific transition. In general, transitions between levels $m$ and $n$ are suppressed if
$$\hbar|\dot H_{mn}| \ll \Delta^2 = \min_t|E_m(t) - E_n(t)|^2. \tag{23}$$


Landau-Zener transitions One example that can be solved exactly is a two-level system with a linearly changing Hamiltonian. Suppose a spin-1/2 particle experiences a magnetic field resulting in the Hamiltonian
$$H(t) = \Delta\sigma_x + \frac{vt}{T}\sigma_z$$
for some constants $\Delta, v, T$. The eigenvalues are $\pm\sqrt{\Delta^2 + (vt/T)^2}$. Assuming $v > 0$, then when $t = -\infty$ the top eigenstate is $|-\rangle$ and the bottom eigenstate is $|+\rangle$. When $t = \infty$ these are reversed; $|+\rangle$ is the top eigenstate and $|-\rangle$ is the bottom eigenstate. When $t = 0$, the eigenstates are $\frac{|+\rangle \pm |-\rangle}{\sqrt 2}$.


See diagram on black-board for energy levels. 

Suppose that $\Delta = 0$ and we start in the $|-\rangle$ state at $t = -\infty$. Then at $t = \infty$ we will still be in the $|-\rangle$ state, with only the phase having changed. But if $\Delta > 0$ and we move slowly enough then the adiabatic approximation says we will remain in the top eigenstate, which for $t = \infty$ will be $|+\rangle$. Thus, the presence of a very small transverse field can completely change the state if we move slowly enough through it.

In this case, the error term in the adiabatic approximation can be calculated rather precisely and is given by the Landau-Zener formula (proof omitted):
$$\Pr[\text{transition}] \approx \exp\left(-\frac{2\pi^2\Delta^2T}{hv}\right) = \exp\left(-\frac{\pi\Delta^2T}{\hbar v}\right).$$
Observe that it has all the qualitative features that we expect in terms of dependence on $\Delta, v, T$, but that it corresponds to a rate of transitions exponentially smaller than our above estimate from first-order perturbation theory. Note that here “transition” refers to transitions between energy levels. Thus starting in $|-\rangle$ and ending in $|+\rangle$ corresponds to “no transition” while ending in $|-\rangle$ would correspond to “transition,” since it means starting in the higher energy level and ending in the lower energy level.
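The Landau-Zener formula can be compared against a direct numerical integration of the Schrödinger equation. The sketch below integrates $i\,\frac{d}{dt}|\Psi\rangle = H(t)|\Psi\rangle$ with $H(t) = \Delta\sigma_x + \frac{vt}{T}\sigma_z$ (units with $\hbar = 1$) using a fixed-step fourth-order Runge-Kutta scheme, starting in $|-\rangle$ at a large negative time and reporting the probability of still being in $|-\rangle$ at a large positive time. The function names, the finite sweep range `S`, the step size and the sample parameters are illustrative assumptions; because the sweep is finite, agreement with the formula is only expected at the level of a few percent.

```python
import math

def lz_transition_probability(delta, v, T, S=30.0, dt=0.005):
    """Integrate from t = -S*T to t = +S*T (hbar = 1), starting in |-> = (0, 1).

    Returns |<-|psi(final)>|^2, i.e. the probability of the "transition"
    from the upper energy level to the lower one.
    """
    def deriv(t, c):
        z = v * t / T
        # H = [[z, delta], [delta, -z]] in the (|+>, |->) basis; dc/dt = -i H c
        return (-1j * (z * c[0] + delta * c[1]),
                -1j * (delta * c[0] - z * c[1]))

    steps = int(round(2.0 * S * T / dt))
    t = -S * T
    c = (0j, 1.0 + 0j)
    for _ in range(steps):
        k1 = deriv(t, c)
        k2 = deriv(t + dt / 2, (c[0] + dt / 2 * k1[0], c[1] + dt / 2 * k1[1]))
        k3 = deriv(t + dt / 2, (c[0] + dt / 2 * k2[0], c[1] + dt / 2 * k2[1]))
        k4 = deriv(t + dt, (c[0] + dt * k3[0], c[1] + dt * k3[1]))
        c = (c[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             c[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        t += dt
    return abs(c[1]) ** 2

def lz_formula(delta, v, T):
    """Landau-Zener prediction exp(-pi delta^2 T / (hbar v)) with hbar = 1."""
    return math.exp(-math.pi * delta**2 * T / v)

if __name__ == "__main__":
    for T in (1.0, 2.0):
        print(T, lz_transition_probability(0.5, 1.0, T), lz_formula(0.5, 1.0, T))
```

Increasing $T$ (a slower sweep) should lower the transition probability, in line with the adiabatic theorem.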


3.2 Berry phase 


Recall that the adiabatic theorem states that if we start in state $|\psi_n(0)\rangle$ and change the Hamiltonian slowly, then we will end in approximately the state
$$e^{i\theta_n(t)}e^{i\gamma_n(t)}|\psi_n(t)\rangle \tag{24a}$$
$$\theta_n(t) = -\frac{1}{\hbar}\int_0^t E_n(t')\,dt' \qquad \gamma_n(t) = \int_0^t\nu_n(t')\,dt' \qquad \nu_n(t) = i\langle\psi_n|\dot\psi_n\rangle \tag{24b}$$
The phase $\gamma_n(t)$ is called the geometric phase, or the Berry phase, after Michael Berry’s 1984 explanation of it.

Do the phases in the adiabatic approximation matter? This is a somewhat subtle question. Of course an overall phase cannot be observed, but a relative phase can lead to observable interference effects. The phases in (22) depend on the eigenstate label $n$, and so in principle interference is possible. But solutions to the equation $H(t)|\psi_n(t)\rangle = E_n(t)|\psi_n(t)\rangle$ are not uniquely defined, and we can in general redefine $|\psi_n(t)\rangle$ by multiplying by a phase that can depend on both $n$ and $t$.


To see how this works, let us consider the example of a spin-1/2 particle in a spatially varying magnetic field. If the particle moves slowly, we can think of the position $\vec r(t)$ as a classical variable causing the spin to experience the Hamiltonian $H(\vec r(t))$. This suggests that we might write the state as a function of $\vec r(t)$, as $|\psi_n(\vec r(t))\rangle$ or even $|\psi_n(\vec r)\rangle$. If the particle’s position is a classical function of time, then we need only consider interference between states with the same value of $\vec r$, and so we can safely change $|\psi_n(\vec r)\rangle$ by any phase that is a function of $n$ and $\vec r$.

In fact, even if the particle were in a superposition of positions as in the two-slit experiment, then we could still only see interference effects between branches of the wavefunction with the same value of $\vec r$. Thus, again we can define an arbitrary $(n,\vec r)$-dependent phase.

More generally, suppose that $H$ depends on some set of coordinates $\vec R(t) = (R_1(t),\ldots,R_N(t))$. The eigenvalue equation (18) becomes $H(\vec R)|\psi_n(\vec R)\rangle = E_n(\vec R)|\psi_n(\vec R)\rangle$, where we leave the time-dependence of $\vec R$ implicit. This allows us to compute even in situations where $\vec R$ is in a superposition of coordinates at a given time $t$.

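To express $\gamma_n(t)$ in terms of $|\psi_n(\vec R)\rangle$, we compute
$$\frac{d}{dt}|\psi_n(\vec R(t))\rangle = \sum_{i=1}^N\frac{\partial|\psi_n(\vec R)\rangle}{\partial R_i}\,\frac{dR_i}{dt} = \vec\nabla_{\vec R}|\psi_n(\vec R)\rangle\cdot\frac{d\vec R}{dt}.$$
$$\gamma_n(t) = \int_0^t i\langle\psi_n|\vec\nabla_{\vec R}|\psi_n\rangle\cdot\frac{d\vec R}{dt'}\,dt' = \int_{\vec R(0)}^{\vec R(t)} i\langle\psi_n|\vec\nabla_{\vec R}|\psi_n\rangle\cdot d\vec R.$$
The answer is in terms of a line integral, which depends only on the path and not on time (unlike the dynamical phase).

How does this change if we reparameterize $|\psi_n(\vec R)\rangle$? Suppose we replace $|\psi_n(\vec R)\rangle$ with $|\tilde\psi_n(\vec R)\rangle = e^{-i\beta(\vec R)}|\psi_n(\vec R)\rangle$. Then the Berry phase becomes
$$\begin{aligned}
\tilde\gamma_n(t) &= \int_{\vec R(0)}^{\vec R(t)} i\langle\tilde\psi_n(\vec R)|\vec\nabla_{\vec R}|\tilde\psi_n(\vec R)\rangle\cdot d\vec R = \int_{\vec R(0)}^{\vec R(t)}\Big(i\langle\psi_n|\vec\nabla_{\vec R}|\psi_n\rangle + \vec\nabla_{\vec R}\beta\Big)\cdot d\vec R \\
&= \gamma_n(t) + \beta(\vec R(t)) - \beta(\vec R(0)).
\end{aligned}$$
Changing $\beta$ only changes phases as a function of the endpoints of the path. Thus, we can eliminate the phase for any fixed path with $\vec R(t) \neq \vec R(0)$, but not simultaneously for all paths. In particular, if a particle takes two different paths to the same point, the difference in their phases cannot be redefined away. More simply, suppose the path is a loop, so that $\vec R(0) = \vec R(t)$. Then regardless of $\beta$ we will have $\tilde\gamma_n = \gamma_n$. This suggests an important point about the Berry phase, which is that it is uniquely defined on closed paths, but not necessarily open ones.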

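Suppose that $\vec R(t)$ follows a closed curve $C$. Then we can write
$$\gamma_n[C] = \oint_C\underbrace{i\langle\psi_n|\vec\nabla_{\vec R}|\psi_n\rangle}_{\vec A_n(\vec R)}\cdot d\vec R = \oint_C\vec A_n(\vec R)\cdot d\vec R,$$
where we have defined the Berry connection $\vec A_n(\vec R) = i\langle\psi_n|\vec\nabla_{\vec R}|\psi_n\rangle$. Note that it is real for the same reason that $\nu_n(t)$ is real.

In some cases, we can simplify $\gamma_n[C]$. If $N = 1$ then the integral is always zero, since the line integral around a closed curve in 1-d is always zero. In 2-d or 3-d we can use Green’s theorem or Stokes’s theorem respectively to simplify the computation of $\gamma_n[C]$. Let’s focus on 3-d, because it contains 2-d as a special case. Then if $S$ denotes the surface enclosed by curve $C$, we have
$$\gamma_n[C] = \oint_C\vec A_n\cdot d\vec R = \iint_S(\vec\nabla_{\vec R}\times\vec A_n)\cdot d\vec a = \iint_S\vec D_n\cdot d\vec a.$$
Here we define the Berry curvature $\vec D_n = \vec\nabla_{\vec R}\times\vec A_n$, and the infinitesimal unit of area $d\vec a$. We can write $\vec D_n$ in a more symmetric way as follows:
$$(\vec D_n)_i = i\sum_{j,k}\epsilon_{ijk}\frac{\partial}{\partial R_j}\langle\psi_n|\frac{\partial}{\partial R_k}|\psi_n\rangle = i\sum_{j,k}\epsilon_{ijk}\left(\frac{\partial\langle\psi_n|}{\partial R_j}\,\frac{\partial|\psi_n\rangle}{\partial R_k} + \langle\psi_n|\frac{\partial^2}{\partial R_j\partial R_k}|\psi_n\rangle\right).$$
Because $\epsilon_{ijk}$ is antisymmetric in $j,k$ and $\frac{\partial^2}{\partial R_j\partial R_k}$ is symmetric, the second term vanishes and we are left with
$$\vec D_n = i\big(\vec\nabla_{\vec R}\langle\psi_n|\big)\times\big(\vec\nabla_{\vec R}|\psi_n\rangle\big). \tag{25}$$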


Example: electron spin in a magnetic field. The Hamiltonian is
$$H = \mu\,\vec\sigma\cdot\vec B,$$
where $\mu$ is the Bohr magneton.


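Suppose that $\vec B = B\hat r$ where $B$ is fixed and we slowly trace out a closed path in the unit sphere with $\hat r$. Suppose that we start in the state
$$|\hat r;+\rangle = |\hat r\rangle = \begin{pmatrix}\cos(\theta/2) \\ e^{i\phi}\sin(\theta/2)\end{pmatrix} \qquad\text{with}\qquad \hat r = \begin{pmatrix}\sin(\theta)\cos(\phi) \\ \sin(\theta)\sin(\phi) \\ \cos(\theta)\end{pmatrix}.$$
Then the adiabatic theorem states that we will remain in the state $|\hat r\rangle$ at later points, up to an overall phase. To compute the geometric phase observe that
$$\vec\nabla = \hat r\,\frac{\partial}{\partial r} + \hat\theta\,\frac{1}{r}\frac{\partial}{\partial\theta} + \hat\phi\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}.$$
Since $\frac{\partial}{\partial r}|\hat r\rangle = 0$ we have
$$\vec\nabla|\hat r\rangle = \frac{1}{2r}\underbrace{\begin{pmatrix}-\sin(\theta/2) \\ e^{i\phi}\cos(\theta/2)\end{pmatrix}}_{\propto\,|-\hat r\rangle}\hat\theta + \frac{1}{r\sin\theta}\begin{pmatrix}0 \\ ie^{i\phi}\sin(\theta/2)\end{pmatrix}\hat\phi.$$
This first term will not contribute to the Berry connection, since it is orthogonal to $|\hat r\rangle$, and so we obtain
$$\vec A_+(\vec r) = i\langle\hat r|\vec\nabla|\hat r\rangle = -\frac{\sin^2(\theta/2)}{r\sin(\theta)}\,\hat\phi.$$
Finally the Berry curvature is
$$\vec D_+ = \vec\nabla\times\vec A_+ = \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\big(\sin\theta\,A_{+,\phi}\big)\,\hat r = -\frac{1}{r^2\sin\theta}\,\frac{\partial\sin^2(\theta/2)}{\partial\theta}\,\hat r = -\frac{\hat r}{2r^2}.$$
For this last computation, observe that $\frac{d}{d\theta}\sin^2(\theta/2) = \sin(\theta/2)\cos(\theta/2) = \frac{1}{2}\sin\theta$. We can now compute the Berry phase as
$$\gamma_+[C] = \iint_S\vec D_+\cdot d\vec a = -\iint_S\frac{\hat r}{2r^2}\cdot r^2d\Omega\,\hat r = -\frac{1}{2}\Omega.$$
Here $d\Omega$ is a unit of solid angle, and $\Omega$ is the solid angle contained by $C$.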
What if we used a different parameterization for $|\hat r\rangle$? An equally valid choice is
$$|\hat r\rangle = \begin{pmatrix}e^{-i\phi}\cos(\theta/2) \\ \sin(\theta/2)\end{pmatrix}. \tag{26}$$
If we carry through the same computation we find that now
$$\vec A_+ = \frac{1}{r}\,\frac{\cos^2(\theta/2)}{\sin(\theta)}\,\hat\phi \qquad\text{and}\qquad \vec D_+ = -\frac{\hat r}{2r^2}.$$
We see that the $\frac{\partial}{\partial\theta}\sin^2(\theta/2)$ was replaced by a $\frac{\partial}{\partial\theta}(-\cos^2(\theta/2))$, which gives the same answer. This is an example of the general principle that the Berry connection is sensitive to our choice of phase convention but the Berry curvature is not. Accordingly the Berry curvature can be observed in experiments.
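The gauge independence of the closed-loop phase can also be seen numerically. The sketch below discretizes a loop at fixed polar angle $\theta$ (with $\phi$ running from 0 to $2\pi$) and uses the standard discretization $\gamma = -\arg\prod_k\langle\psi_k|\psi_{k+1}\rangle$, which follows from $\langle\psi(\vec R)|\psi(\vec R + d\vec R)\rangle \approx e^{-i\vec A\cdot d\vec R}$. It is evaluated for both phase conventions of $|\hat r\rangle$ discussed above; the function names and the number of steps are illustrative assumptions.

```python
import cmath
import math

def state_gauge_a(theta, phi):
    """First convention: (cos(theta/2), e^{i phi} sin(theta/2))."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def state_gauge_b(theta, phi):
    """Second convention, eq. (26): (e^{-i phi} cos(theta/2), sin(theta/2))."""
    return (cmath.exp(-1j * phi) * math.cos(theta / 2), math.sin(theta / 2))

def berry_phase_latitude(state, theta, n_steps=2000):
    """Discretized Berry phase for the loop phi: 0 -> 2*pi at fixed theta.

    Returns -arg(prod_k <psi_k|psi_{k+1}>), reported in (-pi, pi].
    """
    prod = 1.0 + 0j
    for k in range(n_steps):
        a = state(theta, 2.0 * math.pi * k / n_steps)
        b = state(theta, 2.0 * math.pi * (k + 1) / n_steps)
        prod *= a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return -cmath.phase(prod)

if __name__ == "__main__":
    theta = math.pi / 3
    solid_angle = 2.0 * math.pi * (1.0 - math.cos(theta))
    print(berry_phase_latitude(state_gauge_a, theta),
          berry_phase_latitude(state_gauge_b, theta),
          -0.5 * solid_angle)
```

Both conventions should give the same phase modulo $2\pi$, equal to $-\Omega/2$ with $\Omega = 2\pi(1 - \cos\theta)$ the solid angle of the cap enclosed by the loop.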

What if we started instead with the state $|\hat r;-\rangle = |-\hat r\rangle$? Then a similar calculation would find that
$$\gamma_-[C] = \frac{1}{2}\Omega.$$
Since the two states pick up different phases, this can be seen experimentally if we start in a superposition of $|\hat r;+\rangle$ and $|\hat r;-\rangle$.

More generally, if we have a spin-$s$ particle, then its $z$ component of angular momentum can be anything in the range $-s \le m \le s$ and one can show that
$$\gamma_m[C] = -m\Omega.$$


There is much more that can be said about Berry’s phase. An excellent treatment is found in the 
1989 book Geometric phases in physics by Wilczek and Shapere. There is a classical analogue called 
Hannay’s phase. Berry’s phase also has applications to molecular dynamics and to understanding 
electrical and magnetic properties of Bloch states. We will see Berry’s phase again when we discuss 
the Aharonov-Bohm effect in a few weeks. 


3.3 Neutrino oscillations and the MSW effect


In this section we will discuss the application of the adiabatic theorem to a phenomenon involving solar neutrinos. The name neutrino means “little neutral one” and neutrinos are spin-1/2, electrically neutral, almost massless and very weakly interacting particles. Neutrinos were first proposed by Pauli in 1930 to explain the apparent violation of energy, momentum and angular momentum conservation in beta decay. (Since beta decay involves the decay of a neutron into a proton, an electron and an electron antineutrino, but only the proton and electron could be readily detected, there was an apparent anomaly.)

Their almost complete lack of interaction with matter (it takes 100 lightyears of lead to absorb 50% of a beam of neutrinos) has left many properties of neutrinos mysterious. Corresponding to the charged leptons $e^-/e^+$ (electron/positron), $\mu^-/\mu^+$ (muon/antimuon) and $\tau^-/\tau^+$ (tau/antitau), neutrinos (aka neutral leptons) also exist in three flavors: $\nu_e, \nu_\mu, \nu_\tau$, with antineutrinos denoted $\bar\nu_e, \bar\nu_\mu, \bar\nu_\tau$. Most, but not all, interactions preserve lepton number, defined to be the number of leptons minus the number of antileptons. Indeed, most interactions preserve $\#e^- + \#\nu_e - \#e^+ - \#\bar\nu_e$ (electronic number) and similarly for muons and taus. However, these quantities are not conserved by neutrino oscillations. Even the total lepton number is violated by a phenomenon known as the chiral anomaly.


Solar neutrinos Solar neutrinos are produced via the p-p chain reaction, which converts (via a series of reactions)
$$4\,{}^1\mathrm{H} = 4p^+ + 4e^- \mapsto \underbrace{2p^+ + 2n + 2e^-}_{{}^4\mathrm{He}} + 2\nu_e.$$
The resulting neutrinos are produced with energies in the range 0.5-20 MeV. Almost all of the neutrinos produced in the sun are electron neutrinos.


Detection Neutrinos can be detected via inverse beta decay, corresponding to the reaction
$$A + \nu_e \mapsto A' + e^-,$$
where $A, A'$ are different atomic nuclei. For solar neutrinos this will only happen for electron neutrinos because the reaction $A + \nu_\mu \mapsto A' + \mu^-$ will only happen for mu neutrinos carrying at least 108 MeV of kinetic energy. So it is easiest to observe electron neutrinos. However other flavors of neutrinos can also be detected via more complicated processes, such as neutrino-mediated dissociation of deuterium.


Observations of solar neutrinos The first experiment to detect cosmic neutrinos was the 1968 Homestake experiment, led by Ray Davis, which used 100,000 gallons of dry-cleaning fluid (C$_2$Cl$_4$) to detect neutrinos via the process ${}^{37}\mathrm{Cl} + \nu_e \mapsto {}^{37}\mathrm{Ar} + e^-$. However, this only found about 1/3 as many neutrinos as standard solar models predicted.

In 2002, the Sudbury Neutrino Observatory (SNO) measured the total neutrino flux and found that once mu- and tau-neutrinos were accounted for, the total number of neutrinos was correct. Thus, somehow electron neutrinos in the sun had become mu and tau neutrinos by the time they reached the Earth.


Neutrino oscillations The first high-confidence observations of neutrino oscillations were by the Super-Kamiokande experiment in 1998, which could distinguish electron neutrinos from muon neutrinos. Since neutrinos oscillate, they must have energy, which means they must have mass (if we wish to exclude more speculative theories, such as violations of the principle of relativity). This means that a neutrino, in its rest frame, has a Hamiltonian with eigenstates $|\nu_1\rangle, |\nu_2\rangle, |\nu_3\rangle$ that in general will be different from the flavor eigenstates $|\nu_e\rangle, |\nu_\mu\rangle, |\nu_\tau\rangle$ that participate in weak-interaction processes such as beta decay.

We will treat this in a simplified way by neglecting $|\nu_3\rangle$ and $|\nu_\tau\rangle$. So the Hamiltonian can be modeled as (in the $|\nu_e\rangle, |\nu_\mu\rangle$ basis)
$$H = \begin{pmatrix}E_e & \Delta \\ \Delta & E_\mu\end{pmatrix} \tag{27}$$
where $E_e, E_\mu$ are the energies (possibly equal) of the electron and muon neutrinos and $\Delta$ represents a mixing term. Unfortunately, plugging in known parameter estimates for the terms in (27) would predict that roughly a 0.57 fraction of solar neutrinos would end up in the $|\nu_e\rangle$ state, so this still cannot fully explain our observations.


The MSW effect. It turns out that this puzzle can be resolved by a clever use of the adiabatic
theorem. Electron neutrinos scatter off of electrons and thus the Hamiltonian in (27) should be
modified to add a term proportional to the local density of electrons. Thus after some additional
rearranging, we obtain


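$$H = E_0 + \begin{pmatrix} -\Delta_0\cos(2\theta) & \Delta_0\sin(2\theta) \\ \Delta_0\sin(2\theta) & \Delta_0\cos(2\theta) \end{pmatrix} + \begin{pmatrix} CN_e & 0 \\ 0 & 0 \end{pmatrix}, \qquad (28)$$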


where $\Delta_0, \theta$ come from (27) ($\theta \approx \pi/6$ is the “mixing angle” that measures how far the flavor states
are from being eigenstates), $C$ is a constant and $N_e = N_e(\vec r)$ is the local electron density. If the
neutrino is traveling at speed $\approx c$ in direction $\hat x$, then $\vec r \approx ct\hat x$. Thus we can think of $N_e$ as
time-dependent. We then can rewrite $H$ as


$$\mathrm{const}\cdot I + \left(\frac{CN_e}{2} - \Delta_0\cos(2\theta)\right)\sigma_z + \Delta_0\sin(2\theta)\,\sigma_x. \qquad (29)$$


This looks like the adiabatic Landau-Zener transition we studied in the last lecture, although here
the $\sigma_z$ term is no longer being swept from $-\infty$ to $+\infty$. Instead, near the center of the sun, $N_e(0)$
is large and the eigenstates are roughly $|\nu_e\rangle, |\nu_\mu\rangle$. For large $t$, the neutrinos are in vacuum, where
their eigenstates are $|\nu_1\rangle, |\nu_2\rangle$.

If the conditions of the adiabatic theorem are met, then neutrinos that start in state $|\nu_e\rangle$ (in
the center of the sun) will emerge in state $|\nu_2\rangle$ (at the surface of the sun). They will then remain
in this state as they propagate to the Earth. It turns out that this holds for neutrinos of energies
$\gtrsim 2$ MeV. In this case, the probability of observing the neutrino on Earth in the $|\nu_e\rangle$ state (thinking
of neutrino detectors as making measurements in the flavor basis) is $\sin^2(\theta)$, which gives more or
less the observed value of 0.31.
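
The adiabatic argument can be checked numerically with a small sketch. It assumes the matrix form of (28) with the constant term dropped, sets $\Delta_0 = 1$ and $\theta = \pi/6$ purely for illustration, and uses the hypothetical name `cn_e` for the product $CN_e$. It follows the higher-energy eigenvector from a dense core (large $CN_e$) to vacuum ($CN_e = 0$) and compares its $|\nu_e\rangle$ weight with $\sin^2(\theta)$.

```python
import math

def upper_eigenvector(a, b, d):
    """Normalized eigenvector for the larger eigenvalue of [[a, b], [b, d]]."""
    lam = 0.5 * (a + d) + math.hypot(0.5 * (a - d), b)
    v = (lam - d, b)  # satisfies the first row of (H - lam) v = 0
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm)

def msw_hamiltonian(cn_e, delta0, theta):
    """Entries (H_ee, H_emu, H_mumu) of (28) without the constant term.

    cn_e is a stand-in for the product C * N_e (an illustrative name).
    """
    return (cn_e - delta0 * math.cos(2 * theta),
            delta0 * math.sin(2 * theta),
            delta0 * math.cos(2 * theta))

theta, delta0 = math.pi / 6, 1.0   # illustrative values, not fitted parameters

core = upper_eigenvector(*msw_hamiltonian(1e4, delta0, theta))
vacuum = upper_eigenvector(*msw_hamiltonian(0.0, delta0, theta))

p_core_is_nu_e = core[0] ** 2      # close to 1: the upper eigenstate is nearly nu_e
p_detect_nu_e = vacuum[0] ** 2     # nu_e weight of the vacuum eigenstate nu_2
print(p_core_is_nu_e, p_detect_nu_e, math.sin(theta) ** 2)
```

With $\theta = \pi/6$ the last two printed numbers both equal $\sin^2(\pi/6)$; the quoted 0.31 corresponds to a slightly larger mixing angle.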


3.4 Born-Oppenheimer approximation 


Consider a system with N nuclei and n electrons. Write the Hamiltonian as 
$$H = -\sum_{j=1}^{N}\frac{\hbar^2}{2M_j}\nabla^2_{\vec R_j} + H_{el}(\vec R). \qquad (30)$$


Here $\vec R = (\vec R_1, \ldots, \vec R_N)$ denotes the positions of the $N$ nuclei and $H_{el}(\vec R)$ includes all the other
terms, i.e. kinetic energy of the electrons as well as the potential energy terms which include 
electron-electron, nuclei-nuclei and electron-nuclei interactions. Let r denote all of the coordinates 
of the electrons. While (30) may be too hard to solve exactly, we can use a version of the adiabatic 
theorem to derive an approximate solution. 

We will consider a product ansatz: 


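$$\Psi(\vec R, \vec r) = \gamma(\vec R)\,\Phi_{\vec R}(\vec r), \qquad (31)$$
where the many-electron wavefunction is an eigenstate of the reduced Hamiltonian:
$$H_{el}(\vec R)\,\Phi_{\vec R}(\vec r) = E_{el}(\vec R)\,\Phi_{\vec R}(\vec r). \qquad (32)$$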


(Typically this eigenstate will be simply the ground state.) This is plausible because of the adiabatic
theorem. If the nuclei move slowly then as this happens the electrons can rapidly adjust to remain
in their ground states. Then once we have solved (32) we might imagine that we can substitute
back to solve for the nuclear eigenstates. We might guess that they are solutions to the following
eigenvalue equation
$$\left[-\sum_{j=1}^{N}\frac{\hbar^2}{2M_j}\nabla^2_{\vec R_j} + E_{el}(\vec R)\right]\gamma(\vec R) = E\,\gamma(\vec R). \qquad (33)$$


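However, this is not quite right. If we apply $\nabla_{\vec R_j}$ to (31) we obtain
$$\nabla_{\vec R_j}\Psi(\vec R, \vec r) = \left(\nabla_{\vec R_j}\gamma(\vec R)\right)\Phi_{\vec R}(\vec r) + \gamma(\vec R)\,\nabla_{\vec R_j}\Phi_{\vec R}(\vec r). \qquad (34)$$
Using the adiabatic approximation we neglect the overlap of $\nabla_{\vec R_j}\Psi(\vec R, \vec r)$ with all states orthogonal to $|\Phi_{\vec R}\rangle$.
Equivalently we can multiply on the left by $\langle\Phi_{\vec R}|$. This results in
$$\int d^{3n}r\,\Phi_{\vec R}(\vec r)^*\,\nabla_{\vec R_j}\Psi(\vec R, \vec r) = \nabla_{\vec R_j}\gamma(\vec R) + \gamma(\vec R)\int d^{3n}r\,\Phi_{\vec R}(\vec r)^*\,\nabla_{\vec R_j}\Phi_{\vec R}(\vec r) = \left(\nabla_{\vec R_j} - iA_j\right)\gamma(\vec R),$$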


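where $A_j$ is the familiar Berry connection
$$A_j = i\langle\Phi_{\vec R}|\nabla_{\vec R_j}|\Phi_{\vec R}\rangle. \qquad (35)$$
We conclude that the effective Hamiltonian actually experienced by the nuclei should be
$$H_{eff} = -\sum_{j=1}^{N}\frac{\hbar^2}{2M_j}\left(\nabla_{\vec R_j} - iA_j\right)^2 + E_{el}(\vec R). \qquad (36)$$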


We will see these $A_j$ terms again when we discuss electromagnetism later in the semester. In
systems of nuclei and atoms we need at least three nuclei before the $A_j$ terms can have an effect,
for the same reason that we do not see a Berry phase unless we trace out a loop in a parameter
space of dimension $\geq 2$.

The Born-Oppenheimer approximation applies not just to nuclei and electrons but whenever we can divide a
system into fast and slow-moving degrees of freedom; e.g. we can treat a proton as a single particle 
and ignore (or “integrate out”) the motion of the quarks within the proton. This is an important 
principle that we often take for granted. Some more general versions of Born-Oppenheimer are 
called “effective field theory” or the renormalization group. 


4 Scattering 


4.1 Preliminaries 


One of the most important types of experiments in quantum mechanics is scattering. A beam of 
particles is sent into a potential and scatters off it in various directions. The angular distribution 
of scattered particles is then measured. In 8.04 we studied scattering in 1-d, and here we will 
study scattering in 3-d. This is an enormous field, and we will barely scratch the surface of it. In 
particular, we will focus on the following special case: 


• Elastic scattering. The outgoing particle has the same energy as the incoming particle. This
means we can model the particles being scattered off semi-classically, as a static potential 
V(r). The other types of scattering are inelastic scattering, which can involve transformation 
of the particles involved or creation of new particles, and absorption, in which there is no 
outgoing particle. 


• Non-relativistic scattering. This is by contrast with modern accelerators such as the LHC.
However, non-relativistic scattering is still relevant to many cutting-edge experiments, such
as modern searches for cosmic dark matter (which is believed to be traveling at non-relativistic
speeds).


Even this special case can teach us a lot of interesting physics. For example, Rutherford scattering¹
showed that atoms have nuclei, thereby refuting the earlier “plum pudding” model of atoms. This 
led to a model of atoms in which electrons orbit nuclei like planets, and resolving the problems of 
this model in turn was one of the early successes of quantum mechanics. 


Scattering cross section: In scattering problems it is important to think about which physical
quantities can be observed. The incoming particles have a flux that is measured in terms of number
of particles per unit area per unit time, i.e. $\frac{d^2N_{in}}{dA\,dt}$. If we just count the total number of scattered
particles, then this is measured in terms of particles per time: $\frac{dN_{scat}}{dt}$. The ratio of these quantities
has units of area and is called the scattering cross section:
$$\sigma = \frac{dN_{scat}/dt}{d^2N_{in}/(dA\,dt)}. \qquad (37)$$


To get a sense of why these are the right units, consider scattering of classical particles off of a
classical hard sphere of radius $a$. If a particle hits the sphere it will scatter, and if it does not hit
the sphere it will not scatter. Assume that the beam of particles is much wider than the target,
i.e. each particle has trajectory $\vec r = (x_0, y_0, z_0 + vt)$ with $\sqrt{x_0^2 + y_0^2}$ given by a distribution with
standard deviation that is $\gg a$. The particles that scatter will be the ones with $\sqrt{x_0^2 + y_0^2} \leq a$,
which corresponds to a region of area $\pi a^2$, which is precisely the cross-sectional area of the
sphere. Since we have $\frac{dN_{scat}}{dt} = \frac{d^2N_{in}}{dA\,dt}\,\pi a^2$, it follows that $\sigma = \pi a^2$. This simple example is good to
keep in mind to have intuition about the meaning of scattering cross sections.
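
As a rough illustration of definition (37), the sketch below models a uniform beam as a grid of impact points $(x_0, y_0)$, counts the trajectories that hit the sphere, and divides by the flux. The grid resolution, the beam half-width and the function name are illustrative assumptions.

```python
import math

def hard_sphere_cross_section(a, half_width, n):
    """Estimate sigma = (scattering rate) / (incoming flux) for a uniform beam.

    The beam is modelled as an n-by-n grid of impact points (x0, y0) covering
    the square [-half_width, half_width]^2, one particle per grid cell.
    """
    step = 2 * half_width / n
    hits = 0
    for i in range(n):
        x0 = -half_width + (i + 0.5) * step
        for j in range(n):
            y0 = -half_width + (j + 0.5) * step
            if x0 * x0 + y0 * y0 <= a * a:
                hits += 1          # this trajectory strikes the sphere
    flux = (n * n) / (2 * half_width) ** 2   # particles per unit area
    return hits / flux

sigma = hard_sphere_cross_section(a=1.0, half_width=5.0, n=1000)
print(sigma, math.pi)
```

The estimate does not depend on the beam width as long as the beam covers the whole target, which is why the cross section is a property of the target alone.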


Differential cross-section: We can get more information out of an experiment by measuring
the angular dependence of the scattered particles. The number of scattered particles can then be
measured in terms of a rate per solid angle, i.e. $\frac{d^2N_{scat}}{d\Omega\,dt}$. The resulting differential cross-section $\frac{d\sigma}{d\Omega}$
is defined to be
$$\frac{d\sigma}{d\Omega}(\theta, \phi) = \frac{d^2N_{scat}/(d\Omega\,dt)}{d^2N_{in}/(dA\,dt)}. \qquad (38)$$


Here the spherical coordinates $(\theta, \phi)$ denote the direction of the outgoing particles. It is conventional
to define the axes so that the incoming particles have momentum in the $\hat z$ direction, so $\theta$ is the angle
between the scattered particle and the incoming beam (i.e. $\theta = 0$ means no change in direction
and $\theta = \pi$ means backwards scattering) while $\phi$ is the azimuthal angle. Integrating over all angles
gives us the full cross-section, i.e.
$$\sigma = \int d\Omega\,\frac{d\sigma}{d\Omega}. \qquad (39)$$


Quantum mechanical scattering: Assume that the incoming particle states are wavepackets
that are large relative to the target. This allows us to approximate the incoming particles as plane
waves, i.e.
$$\psi_{in} \propto e^{i\vec k\cdot\vec r}, \qquad (40)$$


¹Rutherford scattering is named after Ernest Rutherford for his 1911 explanation of the 1909 experiment which
was carried out by Geiger and Marsden.



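where $E = \frac{\hbar^2k^2}{2m}$. Here we need to assume that the potential $V(\vec r) \to 0$ as $r \to \infty$ so that plane
waves are solutions to the Schrödinger equation for large $r$. For the scattered wave, we should seek
solutions satisfying
$$-\frac{\hbar^2}{2m}\nabla^2\psi_{scat} = E\psi_{scat} \quad \text{as } r \to \infty \qquad (41)$$
$$-\frac{\hbar^2}{2m}\left(\frac{1}{r}\frac{\partial^2}{\partial r^2}r - \frac{\hat L^2}{\hbar^2r^2}\right)\psi_{scat} = E\psi_{scat} \quad \text{in spherical coordinates} \qquad (42)$$
(here $\hat L^2$ is the squared orbital angular momentum operator).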


A general solution can be written as a superposition of separable solutions. Separable solutions to 
(42) can in turn be written as 


$$r\psi(r, \theta, \phi) = u(r)f(\theta, \phi), \qquad (43)$$
in terms of some functions $u(r), f(\theta, \phi)$. In terms of these (42) becomes
$$u''f \underbrace{-\,\frac{1}{\hbar^2r^2}\,u\,\hat L^2f}_{\to 0 \text{ as } r\to\infty} + k^2uf = 0. \qquad (44)$$
Thus, for large $r$, we can cancel the $f$ from each side and simply have $u'' = -k^2u$, which has
solutions $e^{\pm ikr}$. The $e^{ikr}$ solution corresponds to outgoing waves and the $e^{-ikr}$ solution to incoming
waves. A scattered wave should be entirely outgoing, and so we obtain
$$\psi_{scat} \xrightarrow{r\to\infty} \frac{f(\theta, \phi)}{r}e^{ikr} \qquad (45)$$
or more precisely
$$\psi_{scat} = \frac{f(\theta, \phi)}{r}e^{ikr} + O\left(\frac{1}{r^2}\right). \qquad (46)$$
Because the scattering is elastic, the $k$ and $E$ here are the same as for the incoming wave.


Time-independent formulation: As with 1-d scattering problems, the true scattering process
is of course time-dependent, but the quantities of interest (transmission/reflection in 1-d, differential
cross section in 3-d) can be extracted by solving the time-independent Schrödinger equation with
suitable boundary conditions. In the true process, the incoming wave should really be a wavepacket
with well-defined momentum $\approx (0, 0, k)$ and therefore delocalized position. The outgoing wave will
be a combination of an un-scattered part, which looks like the original wave packet continuing
forward in the $\hat z$ direction, and a scattered part, which is a spherical outgoing wavepacket with a
$f(\theta, \phi)$ angular dependence. However, we can treat the incoming wave instead as the static plane
wave $e^{ikz}$ and the scattered wave instead as the static outgoing wave $\frac{f(\theta, \phi)}{r}e^{ikr}$. (Both of these
are valid when $r \to \infty$.) Thus we can formulate the entire scattering problem as a time-independent
boundary-value problem. The high-level strategy is then to solve the Schrödinger equation subject
to the boundary conditions
$$\psi(\vec r) \xrightarrow{r\to\infty} e^{ikz} + \frac{f(\theta, \phi)}{r}e^{ikr}. \qquad (47)$$


This is analogous to what we did in 1-D scattering, where the boundary conditions were that $\psi(x)$
should approach $e^{ikx} + Re^{-ikx}$ for $x \to -\infty$ and should approach $Te^{ikx}$ for $x \to \infty$. As in the 1-D
case, we have to remember that this equation is an approximation for a time-dependent problem. 
As a result when calculating observable quantities we have to remember not to include interference 
terms between the incoming and reflected waves, since these never both exist at the same point in 
time. 



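Relation to observables. In the 1-D case, the probabilities of transmission and reflection are
$|T|^2$ and $|R|^2$ respectively. In the 3-d case, the observable quantities are the differential cross
sections $\frac{d\sigma}{d\Omega}$. To compute these, first evaluate the incoming flux
$$\vec S_{in} = \frac{\hbar}{m}\mathrm{Im}\,\psi_{in}^*\vec\nabla\psi_{in} = \frac{\hbar}{m}\mathrm{Im}\left(e^{-ikz}\,ik\,e^{ikz}\right)\hat z = \frac{\hbar k}{m}\hat z = v\hat z. \qquad (48)$$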


In the last step we have used $v = \hbar k/m$. The units here are off because technically the wavefunction
should be not $e^{ikz}$ but something more like $e^{ikz}/\sqrt{V}$, where $V$ has units of volume. But neglecting
this factor in both the numerator and denominator of (38) will cause this to cancel out. Keeping
that in mind, we calculate the denominator to be
$$\frac{d^2N_{in}}{dA\,dt} = |\vec S_{in}| = v. \qquad (49)$$


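Similarly the outgoing flux is (using $\vec\nabla = \hat r\frac{\partial}{\partial r} + \hat\theta\frac{1}{r}\frac{\partial}{\partial\theta} + \hat\phi\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}$)
$$\vec S_{scat} = \frac{\hbar}{m}\mathrm{Im}\left(\frac{f^*e^{-ikr}}{r}\,\hat r\,ik\,\frac{fe^{ikr}}{r} + O\left(\frac{1}{r^3}\right)\right) = v\frac{|f|^2}{r^2}\hat r + O\left(\frac{1}{r^3}\right). \qquad (50)$$
To relate this to the flux per solid angle we use $d\vec a = r^2d\Omega\,\hat r$ to obtain
$$\frac{d^2N_{scat}}{d\Omega\,dt}\,d\Omega = \vec S_{scat}\cdot d\vec a = \left(\frac{v|f|^2}{r^2}\hat r + O(r^{-3})\right)\cdot r^2d\Omega\,\hat r = |f|^2v\,d\Omega + O(1/r). \qquad (51)$$
We can neglect the $O(1/r)$ term as $r \to \infty$ (and on a side note, we see now why the leading-order
term in $\vec S_{scat}$ was $O(1/r^2)$) and obtain the simple formula
$$\frac{d\sigma}{d\Omega} = |f(\theta, \phi)|^2. \qquad (52)$$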


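The Optical Theorem. (52) is valid everywhere except at $\theta = 0$. There we have to also
consider interference between the scattered and the unscattered wave. (Unlike the incoming wave,
the outgoing unscattered wave does coexist with the scattered wave.) The resulting flux is
$$\vec S_{out} = \frac{\hbar}{m}\mathrm{Im}\left[\left(e^{-ikz} + \frac{f^*e^{-ikr}}{r}\right)\left(ik\hat z\,e^{ikz} + ik\hat r\,\frac{fe^{ikr}}{r}\right)\right] + O(1/r^2) \qquad (53a)$$
$$= \underbrace{v\hat z}_{\vec S_{unscat}} + \underbrace{v\hat r\frac{|f|^2}{r^2}}_{\vec S_{scat}} + \underbrace{v\,\mathrm{Re}\left(\hat z\frac{f^*}{r}e^{ik(z-r)} + \hat r\frac{f}{r}e^{ik(r-z)}\right)}_{\vec S_{inter}} + O(1/r^2). \qquad (53b)$$
This last term can be thought of as the effects of interference. We will evaluate it for large $r$ and
for $\theta \approx 0$. Here $\hat r \approx \hat z$ and we define $\rho = \sqrt{x^2 + y^2}$ so that (to leading order) $r = z + \frac{\rho^2}{2z}$. Then
$$\int\vec S_{inter}\cdot d\vec a = v\int_0^{2\pi}d\phi\int_0^\infty\rho\,d\rho\,\frac{1}{z}\,\mathrm{Re}\left(f^*e^{-ik\rho^2/2z} + fe^{ik\rho^2/2z}\right) \qquad (54a)$$
$$= \frac{4\pi v}{z}\,\mathrm{Re}\int_0^\infty\rho\,d\rho\,e^{ik\rho^2/2z}f(0) \qquad (54b)$$
$$= \frac{4\pi v}{z}\,\mathrm{Re}\int_0^\infty dy\,e^{iky/z}f(0) = -\frac{4\pi v}{k}\,\mathrm{Im}f(0), \qquad (54c)$$
where $y = \rho^2/2$.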



Since the outgoing flux should equal the incoming flux, we can define $A$ to be the beam area and
find
$$Av = Av + v\int d\Omega\,|f(\theta, \phi)|^2 - \frac{4\pi v}{k}\,\mathrm{Im}f(0). \qquad (55)$$
Thus we obtain the following identity, known as the optical theorem:
$$\int d\Omega\,|f(\theta, \phi)|^2 = \frac{4\pi}{k}\,\mathrm{Im}f(0). \qquad (56)$$


All this is well and good, but we have made no progress at all in computing $f(\theta, \phi)$. We will
discuss two approximation methods: the partial wave method (which is exact, but yields nice
approximations when $k$ is very small) and the Born approximation, which is a good approximation
when we are scattering off a weak potential. This is analogous to approximations we have seen
before (see Table 1). We will discuss the Born approximation in Section 4.2 and the partial-wave
technique in Section 4.3.

perturbation type    time-independent    time-dependent    scattering
small                TIPT                TDPT              Born
slow                 WKB                 adiabatic         partial wave

Table 1: Summary of approximation techniques for scattering problems.


4.2 Born Approximation 
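Zooming out, we want to solve the following eigenvalue equation:
$$(\nabla^2 + k^2)|\psi\rangle = U|\psi\rangle \quad \text{where } U \equiv \frac{2m}{\hbar^2}V. \qquad (57)$$
This looks like a basic linear algebra question. Can we solve it by inverting $(\nabla^2 + k^2)$ to obtain
$$|\psi\rangle = (\nabla^2 + k^2)^{-1}U|\psi\rangle\,? \qquad (58)$$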

To answer this, we first review some basic linear algebra. Suppose we want to solve the equation
$$A\vec x = \vec b \qquad (59)$$
for some normal matrix $A$. We can write $\vec x = A^{-1}\vec b$ only if $A$ is invertible. Otherwise the solution
will not be uniquely defined. More generally, suppose that our vectors live on a space $V$. Then we
can divide up $V$ as
$$V = \mathrm{Im}\,A \oplus \ker A \quad \text{where } \ker A = \{\vec x_0 : A\vec x_0 = 0\}. \qquad (60)$$
If we restrict $A$ to the subspace $\mathrm{Im}\,A$ then it is indeed invertible. The solutions to (59) are then
given by
$$\vec x = (A|_{\mathrm{Im}\,A})^{-1}\vec b + \vec x_0 \quad \text{where } \vec x_0 \in \ker A. \qquad (61)$$

Returning now to the quantum case, the operator $(\nabla^2 + k^2)$ is certainly not invertible. States
satisfying $(\nabla^2 + k^2)|\psi_0\rangle = 0$ exist, and are plane waves with momentum $\hbar k$. But if we restrict
$(\nabla^2 + k^2)$ to the subspace of states with momentum $\neq \hbar k$ then it is invertible. Define the Green’s
operator $G$ to be $(\nabla^2 + k^2)|_{p\neq\hbar k}^{-1}$. The calculation of $G$ is rather subtle and the details can be
found in Griffiths. However, on general principles we can make a fair amount of progress. Since
$(\nabla^2 + k^2)$ is diagonal in the momentum basis, $G$ should be as well. Thus $G$ should be written
as an integral over $|\vec p\rangle\langle\vec p|$ times some function of $\vec p$. By Fourier transforming this function we can
equivalently write $G$ in terms of translation operators as
$$G = \int d^3r\,G(\vec r)\,T_{\vec r}, \quad \text{where } T_{\vec r} = e^{-i\vec r\cdot\vec p/\hbar}. \qquad (62)$$


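Let’s go through this more concretely. In the momentum basis we have the completeness relation
$\int d^3p\,|\vec p\rangle\langle\vec p| = I$ which implies
$$\nabla^2 + k^2 = \int d^3p\left(-\frac{p^2}{\hbar^2} + k^2\right)|\vec p\rangle\langle\vec p|. \qquad (63)$$
To invert this we might naively write $G = \int' d^3p\,(-p^2/\hbar^2 + k^2)^{-1}|\vec p\rangle\langle\vec p|$ where $\int'$ denotes the integral
over all $\vec p$ with $p \neq \hbar k$. To handle the diverging denominator, one method is to write
$$G = \lim_{\epsilon\to 0}\int d^3p\left(-\frac{p^2}{\hbar^2} + k^2 + i\epsilon\right)^{-1}|\vec p\rangle\langle\vec p|. \qquad (64)$$
Finally we can write this in the position basis according to (62) and obtain the position-space
Green’s function $G(\vec r)$ by Fourier-transforming $(-p^2/\hbar^2 + k^2 + i\epsilon)^{-1}$. In Griffiths this integral is
carried out obtaining the answer:
$$G(\vec r) = -\frac{e^{ikr}}{4\pi r}. \qquad (65)$$
This function $G(\vec r)$ is called a Green’s function. We can thus write
$$G = \int d^3r\,\frac{-e^{ikr}}{4\pi r}\,T_{\vec r}. \qquad (66)$$
Having computed $G$, we can now solve (57) and obtain
$$|\psi\rangle = |\psi_0\rangle + GU|\psi\rangle \qquad (67)$$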


for some free-particle solution $|\psi_0\rangle$. Indeed for a scattering problem, we should have $\psi_0(\vec r) = e^{ikz}$.
(67) is exact, but not very useful because $|\psi\rangle$ appears on both the LHS and RHS. However, it will
let us expand $|\psi\rangle$ in powers of $U$.

The first Born approximation consists of replacing the $|\psi\rangle$ on the RHS of (67) by $|\psi_0\rangle$, thus
yielding
$$|\psi\rangle = |\psi_0\rangle + GU|\psi_0\rangle. \qquad (68)$$
The second Born approximation consists of using (68) to approximate $|\psi\rangle$ in the RHS of (67), which
yields
$$|\psi\rangle = |\psi_0\rangle + GU(|\psi_0\rangle + GU|\psi_0\rangle) = |\psi_0\rangle + GU|\psi_0\rangle + GUGU|\psi_0\rangle. \qquad (69)$$
Of course we could also rewrite (67) as $|\psi\rangle = (I - GU)^{-1}|\psi_0\rangle = \sum_{n\geq 0}(GU)^n|\psi_0\rangle$ (since $(I - GU)$
is formally invertible) and truncate this sum at some finite value of $n$.
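
The structure of the Born series can be illustrated in a finite-dimensional toy model. The sketch below replaces the operator $GU$ by a hypothetical small $2\times 2$ matrix `gu` (not derived from any physical potential) and compares the truncated series with the exact solution of $(I - GU)\vec\psi = \vec\psi_0$.

```python
def mat_vec(m, v):
    """Multiply a 2x2 matrix (nested tuples) by a length-2 vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def born_series(gu, psi0, order):
    """Return psi0 + GU psi0 + ... + (GU)^order psi0."""
    psi = psi0
    for _ in range(order):
        # One more iteration of psi = psi0 + GU psi, as in (68) and (69).
        step = mat_vec(gu, psi)
        psi = (psi0[0] + step[0], psi0[1] + step[1])
    return psi

def exact_solution(gu, psi0):
    """Solve (I - GU) psi = psi0 with the explicit 2x2 inverse."""
    a, b = 1 - gu[0][0], -gu[0][1]
    c, d = -gu[1][0], 1 - gu[1][1]
    det = a * d - b * c
    return ((d * psi0[0] - b * psi0[1]) / det,
            (-c * psi0[0] + a * psi0[1]) / det)

gu = ((0.10, 0.05j), (-0.02, 0.08))   # hypothetical "weak" GU
psi0 = (1.0, 0.5)

exact = exact_solution(gu, psi0)
for order in (1, 2, 5):
    approx = born_series(gu, psi0, order)
    error = max(abs(approx[0] - exact[0]), abs(approx[1] - exact[1]))
    print(order, error)
```

The error shrinks roughly by a factor of the size of `gu` with each additional order, which is the sense in which the Born approximation requires a weak potential.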



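These results so far have been rather abstract. Plugging in (65) and $\psi_0(\vec r) = e^{ikz}$ we find that
the first Born approximation is
$$\psi(\vec r) = \psi_0(\vec r) + \int d^3r'\,G(\vec r - \vec r')\,U(\vec r')\,\psi_0(\vec r') \qquad (70a)$$
$$= e^{ikz} - \int d^3r'\,\frac{e^{ik|\vec r - \vec r'|}}{4\pi|\vec r - \vec r'|}\,U(\vec r')\,e^{ikz'}. \qquad (70b)$$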


If we assume that the potential is short range and we evaluate this quantity for $r$ far outside the
range of $U$, then we will have $r \gg r'$ for all the points where the integral has a nonzero contribution.
In this case $|\vec r - \vec r'| \approx r - \hat r\cdot\vec r'$. Let us further define
$$\vec k' = k\hat r \quad \text{and} \quad \vec k = k\hat z, \qquad (71)$$


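corresponding to the outgoing and incoming wavevectors respectively. Then we have (still in the
first Born approximation)
$$\psi_{scat}(\vec r) = -\int d^3r'\,e^{ikz'}\,\frac{e^{ik|\vec r - \vec r'|}}{4\pi|\vec r - \vec r'|}\,U(\vec r') \qquad (72a)$$
$$\approx -\int d^3r'\,e^{i\vec k\cdot\vec r'}\,\frac{e^{ikr - ik\hat r\cdot\vec r'}}{4\pi r}\,U(\vec r') \qquad (72b)$$
$$= \left(-\frac{1}{4\pi}\int d^3r'\,e^{-i(\vec k' - \vec k)\cdot\vec r'}\,U(\vec r')\right)\frac{e^{ikr}}{r}. \qquad (72c)$$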


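The quantity in parentheses is then $f(\theta, \phi)$. If we define $\tilde V(\vec q)$ to be the Fourier transform of $V(\vec r)$
then we obtain
$$f_1(\theta, \phi) = -\frac{m}{2\pi\hbar^2}\,\tilde V(\vec k' - \vec k), \qquad (73)$$
where $f_1$ refers to the first Born approximation. One very simple example is when $V(\vec r) = V_0\delta(\vec r)$.
Then we simply have $f_1 = -\frac{mV_0}{2\pi\hbar^2}$.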


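A further simplification occurs in the case when $V(\vec r)$ is centrally symmetric. Then
$$\tilde V(\vec q) = \int d^3r\,V(r)\,e^{-i\vec q\cdot\vec r} = 2\pi\int_0^\infty dr\,r^2V(r)\int_{-1}^{1}d\mu\,e^{-iqr\mu} = \frac{4\pi}{q}\int_0^\infty dr\,rV(r)\sin(qr). \qquad (74)$$
Finally the momentum transfer $\vec q = \vec k' - \vec k$ satisfies $q = 2k\sin(\theta/2)$.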
One application of this (see Griffiths for details) is the Yukawa potential: $V(r) = \beta e^{-\mu r}/r$.
The first Born approximation yields
$$f_1^{Yukawa}(\theta) = -\frac{2m\beta}{\hbar^2(\mu^2 + q^2)}.$$
Taking $\beta = -eQ$ and $\mu = 0$ recovers Rutherford scattering, with
$$f_1^{Rutherford}(\theta) = \frac{2meQ}{\hbar^2q^2} = \frac{meQ}{2\hbar^2k^2\sin^2(\theta/2)}.$$

A good exercise is to rederive the Born approximation by using Fermi’s Golden Rule and 
counting the number of outgoing states in the vicinity of a given k. See section 7.11 of Sakurai for 
details. Another version is in Merzbacher, section 20.1. 



Rigorous derivation of Green’s functions. The above derivation of G was somewhat informal. 
However, once the form of $G(\vec r)$ is derived informally or even guessed, it can be verified that
$(\nabla^2 + k^2)G$ acts as the identity on all states with no component in the null space of $(\nabla^2 + k^2)$.
This is the content of Griffiths problem 11.8. Implicit is that we are working in a Schwartz space
which rules out exponentially growing wavefunctions, and in turn implies that $\nabla^2$ has only real
eigenvalues and is in fact Hermitian. A more rigorous derivation of the Born approximation can 
also be obtained by using time-dependent perturbation theory as described by Sakurai section 7.11 
and Merzbacher section 20.1. I particularly recommend the discussion in Merzbacher. 


4.3 Partial Waves 


In this section assume we have a central potential, i.e. $V(\vec r) = V(r)$. Since our initial conditions
are invariant under rotation about the $\hat z$ axis, our scattering solutions will also be independent of
$\phi$ (but not $\theta$).
Assume further that
$$\lim_{r\to\infty}r^2V(r) = 0. \qquad (75)$$
We will see the relevance of this condition shortly.
A wavefunction with no $\phi$ dependence can be written in terms of Legendre polynomials² (corresponding
to the $m = 0$ spherical harmonics) as
$$\psi(r, \theta) = \sum_{l=0}^{\infty}R_l(r)P_l(\cos\theta). \qquad (76)$$
If we define $u_l(r) = rR_l(r)$, then the eigenvalue equation becomes (for each $l$)
$$-u_l'' + V_{eff}u_l = k^2u_l \quad \text{where } V_{eff} \equiv \frac{2mV}{\hbar^2} + \frac{l(l+1)}{r^2}. \qquad (77)$$
If (75) holds, then for sufficiently large $r$, we can approximate $V_{eff} \approx l(l+1)/r^2$. In this region
the solutions to (77) are given by the spherical Bessel functions. Redefining $x = kr$ and using the
assumption $V_{eff} \approx l(l+1)/r^2$, (77) becomes
$$u_l'' - \frac{l(l+1)}{x^2}u_l = -u_l. \qquad (78)$$


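This has two linearly independent solutions: $xj_l(x)$ and $xn_l(x)$ where $j_l(x), n_l(x)$ are the spherical
Bessel functions of the first and second kind respectively, and are defined as
$$j_l(x) = (-x)^l\left(\frac{1}{x}\frac{d}{dx}\right)^l\frac{\sin x}{x} \quad \text{and} \quad n_l(x) = -(-x)^l\left(\frac{1}{x}\frac{d}{dx}\right)^l\frac{\cos x}{x}. \qquad (79)$$
These can be thought of as “sin-like” and “cos-like” respectively. For $l = 0, 1$, (79) becomes
$$j_0(x) = \frac{\sin x}{x}, \qquad j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x} \qquad (80a)$$
$$n_0(x) = -\frac{\cos x}{x}, \qquad n_1(x) = -\frac{\cos x}{x^2} - \frac{\sin x}{x} \qquad (80b)$$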


²What are Legendre polynomials? One definition starts with the orthogonality condition $\int_{-1}^{1}P_m(x)P_n(x)\,dx = \frac{2}{2n+1}\delta_{mn}$.
(The $\delta_{mn}$ is the important term here; $\frac{2}{2n+1}$ is a somewhat arbitrary convention.) Then if we apply the
Gram-Schmidt procedure to $1, x, x^2, \ldots$, we obtain the Legendre polynomials $P_0 = 1$, $P_1 = x$, $P_2 = \frac{1}{2}(3x^2 - 1)$, ....
Thus any degree-$n$ polynomial can be written as a linear combination of $P_0, \ldots, P_n$ and vice-versa.

The reason for (76) is that if $\psi(r, \theta)$ is independent of $\phi$ then we can write it as a power series in $r$ and $z$, or
equivalently $r$ and $z/r = \cos(\theta)$. These power series can always be written in the form of (76).



One can check (by evaluating the derivatives in (79) repeatedly and keeping only the lowest power
of $1/x$) that as $x \to \infty$ we have the asymptotic behavior
$$j_l(x) \to \frac{1}{x}\sin(x - l\pi/2) \quad \text{and} \quad n_l(x) \to -\frac{1}{x}\cos(x - l\pi/2). \qquad (81)$$
On the other hand, in the $x \to 0$ limit we can keep track of only the lowest power of $x$ to find that
as $x \to 0$ we have
$$j_l(x) \to \frac{x^l}{(2l+1)!!} \quad \text{and} \quad n_l(x) \to -\frac{(2l-1)!!}{x^{l+1}}, \qquad (82)$$
where $(2l+1)!! = (2l+1)(2l-1)\cdots 3\cdot 1$.
Since $j_l$ and $n_l$ are sin-like and cos-like, it will be convenient to define functions that resemble
incoming and outgoing waves. These are the spherical Hankel functions of the first and second
kind:
$$h_l^{(1)} = j_l + in_l \quad \text{and} \quad h_l^{(2)} = j_l - in_l. \qquad (83)$$
For large $r$, $h_l^{(1)}(kr) \to (-i)^{l+1}\frac{e^{ikr}}{kr}$ and so our scattered wave should be proportional to $h_l^{(1)}$. More
precisely, $R_l(r)$ (i.e. the angular-momentum-$l$ component) should be proportional to $h_l^{(1)}(kr)$.


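Putting this together we get
$$\psi(r, \theta) \overset{V=0}{=} e^{ikz} + \underbrace{\sum_{l\geq 0}c_l\,h_l^{(1)}(kr)\,P_l(\cos\theta)}_{\psi_{scat}}, \qquad (84)$$
for some coefficients $c_l$. It will be convenient to write the $c_l$ in terms of new coefficients $a_l$ as
$c_l = k\,i^{l+1}(2l+1)a_l$, so that we have
$$\psi_{scat}(r, \theta) \overset{V=0}{=} k\sum_{l\geq 0}i^{l+1}(2l+1)\,a_l\,h_l^{(1)}(kr)\,P_l(\cos\theta) \qquad (85)$$
$$\xrightarrow{r\to\infty} \frac{e^{ikr}}{r}\underbrace{\sum_{l\geq 0}(2l+1)\,a_l\,P_l(\cos\theta)}_{f(\theta)}. \qquad (86)$$
We can then compute the differential cross-section in terms of the $a_l$:
$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2 = \left|\sum_l(2l+1)\,a_l\,P_l(\cos\theta)\right|^2. \qquad (87)$$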


Recall that the Legendre polynomials satisfy the orthogonality relation
$$\int_{-1}^{1}dz\,P_l(z)P_{l'}(z) = \delta_{l,l'}\,\frac{2}{2l+1}. \qquad (88)$$
Thus we can calculate
$$\sigma = \int d\Omega\,\frac{d\sigma}{d\Omega} = 4\pi\sum_l(2l+1)|a_l|^2. \qquad (89)$$
While the differential cross section involves interference between different values of $l$, the total cross
section is simply given by an incoherent sum over $l$. Intuitively this is because we can think of $l$ and
$\theta$ as conjugate observables, analogous to momentum and position. If we measure the probability
of observing something at a particular position, we will see interference effects between different
momenta, but if we integrate over all positions, these will go away.

The beauty of the partial-wave approach is that it reduces to a series of 1-d scattering problems,
one for each value of $l$. We have written down the form of the scattered wave. For the incoming
wave, we need to express $e^{ikz}$ in the form of (76). From the arguments above we can see that $e^{ikz}$
should have form $\sum_l(A_lj_l(kr) + B_ln_l(kr))P_l(\cos\theta)$. The specific solution is given by Rayleigh’s
formula which we state without proof:
$$e^{ikz} = e^{ikr\cos\theta} = \sum_{l\geq 0}i^l(2l+1)\,j_l(kr)\,P_l(\cos\theta). \qquad (90)$$


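Plugging this in, we find that in the $V = 0$ region we have
$$\psi(r, \theta) = \sum_{l\geq 0}i^l(2l+1)P_l(\cos\theta)\Big[\underbrace{j_l(kr)}_{\text{plane wave}} + \underbrace{ika_l\,h_l^{(1)}(kr)}_{\text{scattered}}\Big] \qquad (91)$$
$$= \frac{1}{2}\sum_{l\geq 0}i^l(2l+1)P_l(\cos\theta)\Big[\underbrace{h_l^{(1)}(kr)(1 + 2ika_l)}_{\text{outgoing}} + \underbrace{h_l^{(2)}(kr)}_{\text{incoming}}\Big]. \qquad (92)$$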


Now we have really expressed this as a series of 1-d scattering problems. Here comes the crucial
simplifying move. Because we are assuming that the collision is elastic, probability and angular-momentum
conservation means that “what goes in must come out”; i.e.
$$|1 + 2ika_l| = 1. \qquad (93)$$
The outgoing wave’s amplitude must have the same absolute value as the incoming wave’s amplitude.
We can rewrite (93) by introducing the phase shift $\delta_l$ defined by
$$1 + 2ika_l = e^{2i\delta_l}. \qquad (94)$$
The factor of 2 is conventional. We can rewrite (94) to solve for $a_l$ as
$$a_l = \frac{e^{2i\delta_l} - 1}{2ik} = \frac{e^{i\delta_l}\sin(\delta_l)}{k} = \frac{1}{k}\,\frac{1}{\cot(\delta_l) - i}. \qquad (95)$$
Many equivalent expressions are also possible. In terms of the phase shifts, we can write
$$f(\theta) = \frac{1}{k}\sum_l(2l+1)P_l(\cos\theta)\,e^{i\delta_l}\sin(\delta_l) \qquad (96)$$
$$\sigma = \frac{4\pi}{k^2}\sum_l(2l+1)\sin^2(\delta_l). \qquad (97)$$

1 
As an application we can verify that this satisfies the optical theorem. Observe that $P_l(1) = 1$ for
all $l$. Thus
$$\frac{4\pi}{k}\,\mathrm{Im}f(0) = \frac{4\pi}{k}\cdot\frac{1}{k}\sum_l(2l+1)P_l(1)\sin^2(\delta_l) = \sigma. \qquad (98)$$
Another easy application is partial-wave unitarity, which bounds the total amount of scattered
wave with angular momentum $l$. Define $\sigma_l \equiv \frac{4\pi}{k^2}(2l+1)\sin^2(\delta_l)$, so that $\sigma = \sum_l\sigma_l$. Then using
$\sin^2(\delta_l) \leq 1$ we have
$$\sigma_l \leq \frac{4\pi}{k^2}(2l+1). \qquad (99)$$


This bound is called “partial-wave unitarity.” 
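
The relations (96), (97) and the optical theorem can be checked against each other numerically. The sketch below uses a hypothetical set of three phase shifts and an arbitrary wavenumber; it computes $\sigma$ as the incoherent sum over $l$, from $\mathrm{Im}f(0)$, and by integrating $|f(\theta)|^2$ over all angles.

```python
import cmath
import math

def legendre(l_max, z):
    """Return [P_0(z), ..., P_l_max(z)] from the three-term recurrence."""
    p = [1.0, z]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * z * p[l] - l * p[l - 1]) / (l + 1))
    return p[:l_max + 1]

def amplitude(deltas, k, cos_theta):
    """f(theta) from (96) for the given list of phase shifts delta_l."""
    p = legendre(len(deltas) - 1, cos_theta)
    return sum((2 * l + 1) * p[l] * cmath.exp(1j * d) * math.sin(d)
               for l, d in enumerate(deltas)) / k

k = 1.3
deltas = [0.7, -0.3, 0.1]          # hypothetical phase shifts for l = 0, 1, 2

# (97): incoherent sum over l.
sigma_sum = 4 * math.pi / k ** 2 * sum(
    (2 * l + 1) * math.sin(d) ** 2 for l, d in enumerate(deltas))

# Optical theorem: sigma = (4 pi / k) Im f(0), with cos(0) = 1.
sigma_optical = 4 * math.pi / k * amplitude(deltas, k, 1.0).imag

# Direct angular integral of |f|^2 (midpoint rule in z = cos(theta)).
n = 20000
sigma_integral = 2 * math.pi * sum(
    abs(amplitude(deltas, k, -1 + (i + 0.5) * 2 / n)) ** 2 for i in range(n)) * 2 / n

print(sigma_sum, sigma_optical, sigma_integral)
```

All three numbers agree up to the discretization error of the angular integral, even though $|f(\theta)|^2$ itself contains cross terms between different $l$.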


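How to compute phase shifts. Let us look again at the $r \to \infty$ solution. We can use the fact
that overall phase and normalization don’t matter to obtain
$$R_l(r) = h_l^{(1)}(kr)\,e^{2i\delta_l} + h_l^{(2)}(kr) \qquad (100a)$$
$$= (1 + e^{2i\delta_l})\,j_l(kr) + i(e^{2i\delta_l} - 1)\,n_l(kr) \qquad (100b)$$
$$\propto \cos(\delta_l)\,j_l(kr) - \sin(\delta_l)\,n_l(kr) \qquad (100c)$$
$$\propto j_l(kr) - \tan(\delta_l)\,n_l(kr). \qquad (100d)$$
Suppose that we know the interior solution $R_l(r)$ for $r \leq b$ and that $V(r) = 0$ for $r > b$. Then we
can compute the phase shift by matching $\frac{R_l'(r)}{R_l(r)}$ for $r = b \pm \epsilon$.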


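Here is a simple example. Consider a hard sphere of radius $b$. Then $R_l(r) = 0$ for $r \leq b$ and we
have
$$j_l(kb) - \tan(\delta_l)\,n_l(kb) = 0 \quad\Rightarrow\quad \delta_l = \tan^{-1}\left(\frac{j_l(kb)}{n_l(kb)}\right). \qquad (101)$$
One particularly simple case is when $l = 0$. Then
$$\delta_0 = \tan^{-1}\left(\frac{j_0(kb)}{n_0(kb)}\right) = \tan^{-1}\left(\frac{\sin(kb)/kb}{-\cos(kb)/kb}\right) = -\tan^{-1}(\tan(kb)) = -kb. \qquad (102)$$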


It turns out that in general repulsive potentials yield negative phase shifts.
What about larger values of $l$? Suppose that $kb \ll 1$. Then using the $x \to 0$ approximations
for $j_l(x), n_l(x)$, we obtain
$$\delta_l \approx \tan^{-1}\left(-\frac{(kb)^{2l+1}}{(2l+1)!!\,(2l-1)!!}\right) \approx -\frac{(kb)^{2l+1}}{(2l+1)\left((2l-1)!!\right)^2}. \qquad (103)$$
Thus the $l = 0$ scattering dominates. In terms of cross-sections, we have
$$\sigma_l = \frac{4\pi}{k^2}(2l+1)\sin^2(\delta_l) \approx \frac{4\pi}{(2l+1)\left((2l-1)!!\right)^4}\,(kb)^{4l}\,b^2. \qquad (104)$$
For $l = 0$ this is $4\pi b^2$, which is four times the classical value of $\pi b^2$, and for higher values of $l$ this
drops exponentially (assuming $kb \ll 1$). Even if $kb > 1$ this drops exponentially once $l \gg kb$, which
confirms our intuition that the angular momentum should be on the order of $\hbar kb$.

Another way to understand why low values of $l$ are favored is that the $l(l+1)/r^2$
term in $V_{eff}$ forms an angular momentum “barrier” that prevents low-energy incoming waves from
penetrating to small enough $r$ to see the potential $V(r)$.


The high-energy limit. The partial wave approximation is easiest to use in the low-$k$ limit
because then we can restrict our attention to a few values of $l$, or even just $l = 0$. But for the hard
sphere we can also evaluate the $kb \gg 1$ limit. In this case we expect to find angular momenta up
to $l_{max} = kb$. Thus we approximate the total cross section by
$$\sigma \approx \frac{4\pi}{k^2}\sum_{l=0}^{l_{max}}(2l+1)\sin^2(\delta_l). \qquad (105)$$
The phases $\delta_l$ will vary over the entire range from 0 to $2\pi$ so we simply approximate $\sin^2(\delta_l)$ by its
average value of 1/2. Thus we obtain
$$\sigma \approx \frac{2\pi}{k^2}\sum_{l=0}^{kb}(2l+1) = 2\pi b^2. \qquad (106)$$
This is now twice the classical result. Even though the particles are moving quickly they still
diffract like waves. One surprising consequence is that even though a hard sphere leaves a shadow,
there is a small bright spot at the center of the shadow. Indeed the optical theorem predicts that
$\mathrm{Im}f(0) = \frac{k\sigma}{4\pi} \approx kb^2/2$. Thus $|f(0)|^2 \geq (kb)^2b^2/4$. For a further discussion of this bright spot and
the role it played in early 19th-century debates about whether light is a particle and/or a wave,
look up “Arago spot” on Wikipedia.
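
Both limits of the hard-sphere cross section can be explored with a short sketch based on (97) and (101). It builds $j_l$ and $n_l$ by the standard upward recurrence $f_{l+1}(x) = \frac{2l+1}{x}f_l(x) - f_{l-1}(x)$ starting from (80a) and (80b); the cutoff `extra` on the number of partial waves beyond $l = kb$ is an illustrative choice.

```python
import math

def spherical_bessel_jn_nn(l_max, x):
    """Return lists (j_l(x), n_l(x)) for l = 0..l_max by upward recurrence."""
    j = [math.sin(x) / x, math.sin(x) / x ** 2 - math.cos(x) / x]
    n = [-math.cos(x) / x, -math.cos(x) / x ** 2 - math.sin(x) / x]
    for l in range(1, l_max):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        n.append((2 * l + 1) / x * n[l] - n[l - 1])
    return j[:l_max + 1], n[:l_max + 1]

def hard_sphere_sigma(k, b, extra=30):
    """Total cross section (97) with tan(delta_l) = j_l(kb) / n_l(kb) from (101)."""
    l_max = int(k * b) + extra
    j, n = spherical_bessel_jn_nn(l_max, k * b)
    # sin^2(delta_l) = tan^2 / (1 + tan^2) = j_l^2 / (j_l^2 + n_l^2)
    total = sum((2 * l + 1) * j[l] ** 2 / (j[l] ** 2 + n[l] ** 2)
                for l in range(l_max + 1))
    return 4 * math.pi / k ** 2 * total

b = 1.0
for k in (0.01, 1.0, 50.0):
    print(k * b, hard_sphere_sigma(k, b) / (math.pi * b ** 2))
```

The printed ratio $\sigma/\pi b^2$ is very close to 4 for $kb \ll 1$ and approaches 2 from above, rather slowly, as $kb$ grows.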


Phase shifts. As we have seen, scattering can be understood in terms of phase shifts. Now we
describe a simple physical way of seeing this. If $V = 0$ then a plane wave has $u_0(r) = \sin(kr)$,
due to the $u_0(0) = 0$ boundary condition. When there is scattering, the phase shift $\delta_0$ will become
nonzero and we will have
$$u_0(r) = \sin(kr + \delta_0).$$
If the potential is attractive then the phase will oscillate more rapidly in the scattering region and
so we will have $\delta_0 > 0$ while if it is repulsive then the phase will oscillate more slowly and we will
have $\delta_0 < 0$. See Fig. 2 for an illustration.

[Figure 2 plot; curves: no scattering, positive phase shift, negative phase shift]

Figure 2: The phase shift $\delta_0$ is positive for attractive potentials and negative for repulsive potentials.


Scattering length. In the regime of low $k$ it turns out that many potentials behave qualitatively
like what we have seen with the hard sphere, with a characteristic length scale called a “scattering
length.” To derive this suppose there is some $b$ such that $V(r) = 0$ for $r > b$. In this region we
have $u_0(r) \propto \sin(kr + \delta_0)$ (neglecting normalization). In the vicinity of $b$ we have
$$u_0(r) \approx u_0(b) + u_0'(b)(r - b).$$
If we extrapolate to smaller values of $r$ our approximation hits 0 at $r = a$ where
$$a = b - \frac{u_0(b)}{u_0'(b)} = b - \frac{\tan(kb + \delta_0)}{k}.$$
Using the tan addition formula and taking the limit $k \to 0$ we find
$$a = b - \frac{1}{k}\,\frac{\tan(kb) + \tan(\delta_0)}{1 - \tan(kb)\tan(\delta_0)} \approx b - \frac{kb + \tan(\delta_0)}{k} = -\frac{\tan(\delta_0)}{k}. \qquad (107)$$
Rearranging we have $\tan(\delta_0) = -ka$, and in the $ka \ll 1$ limit this yields
$$\sigma_0 = \frac{4\pi}{k^2}\sin^2(\delta_0) \approx 4\pi a^2,$$
which is again the hard sphere result. Similar results hold for larger values of $l$. Thus the scattering
length can be thought of as an effective size of a scattering target.
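
To see the scattering length at work beyond the hard sphere, the sketch below considers a "soft sphere", a repulsive step $V = V_0$ for $r < b$ and zero outside. This potential is an assumption introduced only for this example, with $\kappa^2 = 2mV_0/\hbar^2$. The interior $s$-wave solution is $u_0 = \sinh(\kappa' r)$ with $\kappa' = \sqrt{\kappa^2 - k^2}$, and matching $u_0/u_0'$ at $r = b$ gives $\delta_0$; the zero-energy formula $a = b - u_0(b)/u_0'(b)$ is then compared with $-\tan(\delta_0)/k$ at small $k$.

```python
import math

def delta0_soft_sphere(k, kappa, b):
    """s-wave phase shift for V = V0 (r < b), 0 (r > b); requires kappa > k."""
    kp = math.sqrt(kappa ** 2 - k ** 2)          # interior solution u = sinh(kp r)
    ratio = math.tanh(kp * b) / kp               # u(b) / u'(b) from the inside
    # Outside u = sin(k r + delta0), so u(b) / u'(b) = tan(k b + delta0) / k.
    return math.atan(k * ratio) - k * b

def scattering_length(kappa, b):
    """a = b - u(b)/u'(b) evaluated at zero energy."""
    return b - math.tanh(kappa * b) / kappa

b = 1.0
for kappa in (0.5, 2.0, 50.0):
    a = scattering_length(kappa, b)
    k = 1e-3
    a_from_phase = -math.tan(delta0_soft_sphere(k, kappa, b)) / k
    print(kappa, a, a_from_phase)
```

For a weak barrier the effective size $a$ is much smaller than $b$, and as $\kappa b$ grows $a$ approaches $b$, recovering the hard-sphere result.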



8.06 Spring 2016 Lecture Notes 


3. Entanglement, density matrices and decoherence 


Aram W. Harrow 


Last updated: May 18, 2016 


Contents

1 Axioms of Quantum Mechanics
1.1 One system
1.2 Two systems
1.3 The problem of partial measurement

2 Density operators
2.1 Introduction and definition
2.2 Examples

3 The general rule for density operators
3.1 Positive semidefinite matrices
3.2 Proof of the density-matrix conditions
3.3 Application to spin-1/2 particles: the Bloch ball

4 Dynamics of density matrices
4.1 Schrödinger equation
4.2 Measurement
4.3 Decoherence

5 Examples of decoherence
5.1 Looking inside a Mach-Zehnder interferometer
5.2 Spin rotations in NMR
5.3 Spontaneous emission

6 Multipartite density matrices
6.1 Product states
6.2 Partial measurement and partial trace
6.3 Purifications

1 Axioms of Quantum Mechanics 


We begin with a (very) quick review of some concepts from 8.04 and 8.05. 


1.1 One system 
States are given by unit vectors $|\psi\rangle \in V$ for some vector space $V$.
Observables are Hermitian operators $A \in L(V)$.


Measurements
Suppose $A = \sum_{i=1}^{d}\lambda_i|v_i\rangle\langle v_i|$ for $\{|v_1\rangle, \ldots, |v_d\rangle\}$ an orthonormal basis of eigenvectors and (for
simplicity) each $\lambda_i$ distinct. If we measure observable $A$ on state $|\psi\rangle$ then the outcomes are
distributed according to $\Pr[\lambda_i] = |\langle\psi|v_i\rangle|^2$.


Time evolution is given by Schrödinger’s equation: $i\hbar\frac{\partial}{\partial t}|\psi\rangle = H|\psi\rangle$ where $H$ is the Hamiltonian.


Heisenberg picture We can instead evolve operators in time using
$$i\hbar\frac{d}{dt}A_H = [A_H, H_H]. \qquad (1)$$


Time-independent solution
If the Hamiltonian does not change in time, then the time evolution operator for time $t$
is the unitary operator $U = e^{-iHt/\hbar}$. The state evolves according to $|\psi(t)\rangle = U|\psi(0)\rangle$ in the
Schrödinger picture or the operator evolves according to $A_H(t) = U^\dagger A_H(0)U$ in the Heisenberg
picture.


Systems are described by a pair (V, H). 


1.2 Two systems 
Let’s see how things change when we have two quantum systems: $(V_1, H_1)$ and $(V_2, H_2)$.


States are given by unit vectors $|\psi\rangle \in V$ where
\[ V = V_1 \otimes V_2 = \mathrm{span}\{|\psi_1\rangle \otimes |\psi_2\rangle : |\psi_1\rangle \in V_1, |\psi_2\rangle \in V_2\}. \]


A special case are the product states of the form $|\psi_1\rangle \otimes |\psi_2\rangle$. States that are not product are
called entangled.


Observables are still Hermitian operators $A \in L(V)$. A general observable may involve
interactions between the two systems. Local observables are of the form $A \otimes I$, $I \otimes B$ or more
generally $A \otimes I + I \otimes B$, and correspond to properties that can be measured without the
two systems interacting.


Measurements 
The usual measurement rule still holds for collective measurements. But when only one 
system is measured, we need a way to explain what happens to the other system. Suppose 
we measure the first system using the orthonormal basis $\{|v_1\rangle, \ldots, |v_d\rangle\}$. (Equivalently, we
measure an operator with distinct eigenvalues and with eigenvectors $|v_1\rangle, \ldots, |v_d\rangle$.) If the
overall system is in state $|\psi\rangle$, then the first step is to write $|\psi\rangle$ as

\[ |\psi\rangle = \sum_{i=1}^d \sqrt{p_i}\, |v_i\rangle \otimes |w_i\rangle, \]

for some unit vectors $|w_1\rangle, \ldots, |w_d\rangle$ (not necessarily orthogonal) and some $p_1, \ldots, p_d$ such that
$p_i \geq 0$ and $\sum_{i=1}^d p_i = 1$.


Then the probability of outcome $i$ is $p_i$ and the residual state in this case is $|v_i\rangle \otimes |w_i\rangle$.


Time evolution is still given by Schrédinger’s equation, but now the joint Hamiltonian of two 
non-interacting systems is 


\[ H = H_1 \otimes I + I \otimes H_2. \tag{2} \]
Interactions can add more terms, such as the $1/|\vec r_1 - \vec r_2|$ Coulomb interaction, which generally
cannot be written in this way. Note that a Hamiltonian term of the form $A_1 \otimes A_2$ does
represent an interaction; e.g. $\sigma_z \otimes \sigma_z$ has energy $\pm 1$ depending on whether the two spins have
$\hat z$ components pointing in the same or opposite directions.


Time-independent solution
For a Hamiltonian of the form in (2), the time evolution operator is

\[ U = e^{-iHt/\hbar} = e^{-iH_1 t/\hbar} \otimes e^{-iH_2 t/\hbar}. \]

You should convince yourself that this second equality is true. Of course if the Hamiltonian
contains interactions then $U$ will generally not be of this form.
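
A quick numerical check of this factorization can be sketched as follows. This is only an illustration under stated assumptions: units with $\hbar = 1$, two randomly chosen Hermitian matrices standing in for $H_1$ and $H_2$, and helper names (`random_hermitian`, `evolve`) that are hypothetical and not part of the notes.

```python
import numpy as np

def random_hermitian(d, rng):
    """Random d x d Hermitian matrix (illustrative helper)."""
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (m + m.conj().T) / 2

def evolve(h, t):
    """U = exp(-i h t) for Hermitian h, built from its eigendecomposition (hbar = 1)."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T

rng = np.random.default_rng(0)
h1, h2 = random_hermitian(2, rng), random_hermitian(3, rng)
t = 0.7

# Joint Hamiltonian of two non-interacting systems, as in (2).
h_joint = np.kron(h1, np.eye(3)) + np.kron(np.eye(2), h2)

u_joint = evolve(h_joint, t)
u_product = np.kron(evolve(h1, t), evolve(h2, t))
factorizes = np.allclose(u_joint, u_product)
```

The check works because $H_1 \otimes I$ and $I \otimes H_2$ commute, so the exponential of their sum splits into a product of exponentials.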


These principles are actually profoundly different from anything we have seen before. For 
example, consider the number of degrees of freedom. One $d$-level system needs $d$ complex numbers
to describe (neglecting normalization and the overall phase ambiguity) but $N$ $d$-level systems need
$d^N$ complex numbers to describe, instead of $dN$. This exponential extravagance is behind the power
of quantum computers, which will be discussed briefly at the end of the course, if time permits. It
also seemed intuitively wrong to many physicists in the early 20th century, most notably including
Einstein. The objections of EPR [A. Einstein, B. Podolsky and N. Rosen, Physical Review 47, 777–780
(1935)] led to Bell’s theorem, which we saw in 8.05 and will review on pset 6. Here, though,
we will consider a simpler problem. 


1.3. The problem of partial measurement 


Let us revisit the scenario where we measure part of an entangled state. Suppose that Alice and 
Bob each have a spin-1/2 particle in the singlet state 


\[ |\Psi\rangle = \frac{|+\rangle \otimes |-\rangle - |-\rangle \otimes |+\rangle}{\sqrt{2}}. \tag{3} \]


(The singlet is an arbitrary but nice choice. The argument would be essentially the same for any
entangled state.) Imagine that Alice and Bob are far apart so that they should not be able to
quickly send messages to one another.

Now suppose that Alice decides to measure her state in the $\{|+\rangle, |-\rangle\}$ basis. Using the above
rules we find that the outcomes are as described in Table 1.


Alice’s outcome      joint state                      Bob’s state
$\Pr[+] = 1/2$       $|+\rangle \otimes |-\rangle$    $|-\rangle$
$\Pr[-] = 1/2$       $|-\rangle \otimes |+\rangle$    $|+\rangle$


Table 1: Outcomes when Alice measures her half of the singlet state (3) in the $\{|+\rangle, |-\rangle\}$ basis.


What can we say about Bob’s state after such a measurement? It is not a deterministic object, 
but rather an ensemble of states, each with an associated probability. For this we use the notation 


\[ \{(p_1, |\psi_1\rangle), \ldots, (p_m, |\psi_m\rangle)\} \tag{4} \]


to indicate that state $|\psi_i\rangle$ occurs with probability $p_i$. The numbers $p_1, \ldots, p_m$ should form a
probability distribution, meaning that they are nonnegative reals that sum to one. The states $|\psi_i\rangle$
should be unit vectors but do not have to be orthogonal. In fact, the number $m$ could be much
larger than the dimension d, and could even be infinite; e.g. we could imagine a state with some 
coefficients that are given by a Gaussian distribution. We generally consider m to be finite because 
it keeps the notation simple and doesn’t sacrifice any important generality. 

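In the example where Alice measures in the $\{|+\rangle, |-\rangle\}$ basis, Bob is left with the ensemble

\[ \left\{ \left(\tfrac{1}{2}, |+\rangle\right), \left(\tfrac{1}{2}, |-\rangle\right) \right\}. \tag{5} \]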


What if Alice chooses a different basis? Recall from 8.05 that if $\vec n \in \mathbb{R}^3$ is a unit vector then a
spin-1/2 particle pointing in that direction has state

\[ |\vec n\rangle = |\vec n; +\rangle = \cos\frac{\theta}{2}\, |+\rangle + \sin\frac{\theta}{2}\, e^{i\phi} |-\rangle. \tag{6} \]
Here $(1, \theta, \phi)$ are the polar coordinates for $\vec n$; i.e. $n_x = \sin\theta\cos\phi$, $n_y = \sin\theta\sin\phi$, and $n_z = \cos\theta$.
The notation $|\vec n; +\rangle$ was what we used in 8.05 and $|\vec n\rangle$ will be the notation used in 8.06 in contexts
where it is clear that we are talking about spin states. The orthonormal basis $\{|\vec n; +\rangle, |\vec n; -\rangle\}$ in
our new notation is denoted $\{|\vec n\rangle, |-\vec n\rangle\}$.

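Suppose that Alice measures in the $\{|\vec n\rangle, |-\vec n\rangle\}$ basis. It can be shown (see 8.05 notes or
Griffiths §12.2) that for any $\vec n$, the singlet state (3) can be written, up to an overall phase, as

\[ \frac{|\vec n\rangle \otimes |-\vec n\rangle - |-\vec n\rangle \otimes |\vec n\rangle}{\sqrt{2}}. \]

Thus, for any choice of $\vec n$, the two outcomes are equally likely and in each case Bob is left with a
spin pointing in the opposite direction, as described in Table 2.


Alice’s outcome            joint state                                  Bob’s state
$\Pr[\vec n] = 1/2$        $|\vec n\rangle \otimes |-\vec n\rangle$     $|-\vec n\rangle$
$\Pr[-\vec n] = 1/2$       $|-\vec n\rangle \otimes |\vec n\rangle$     $|\vec n\rangle$


Table 2: Outcomes when Alice measures her half of the singlet state (3) in the $\{|\vec n\rangle, |-\vec n\rangle\}$ basis.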


This leaves Bob with the ensemble 


\[ \left\{ \left(\tfrac{1}{2}, |\vec n\rangle\right), \left(\tfrac{1}{2}, |-\vec n\rangle\right) \right\}. \]


Uh-oh! At this point, our elegant theories of quantum mechanics have run into a number of 
problems. 


• Theory isn’t closed. When we combine two systems with tensor product we get a new
system, meaning a new vector space and a new Hamiltonian. It still fits the definition of 
a quantum system. But when we look at the state of a subsystem, we do not get a single 
quantum state, we get an ensemble. Thus, if we start with states being represented by unit 
vectors, we are inevitably forced into having to use ensembles of vectors instead. 


• Ensembles aren’t unique. Any choice of $\vec n$ will give Bob a different ensemble. We expect
our physical theories to give us unique answers, but here we cannot uniquely determine which
ensemble is the right one for Bob. Note that other choices of measurement can leave Bob
with different ensembles as well; e.g. if Alice flips a coin and uses that to choose between
two measurement settings, then Bob will have a distribution over four states, each occurring
with probability 1/4. 


• Time travel?! If Bob could distinguish between these different ensembles (including the case
in which Alice does nothing and he still holds half of an entangled state), then Alice could 
instantaneously communicate to Bob with her choice of measurement basis (or perhaps her 
choice of whether to measure at all or not). According to special relativity, there is a different 
inertial frame in which this process looks like Alice sending a message backwards in time. 
This rapidly leads to trouble... 


Fortunately density operators solve all three problems! As a bonus, they are far more elegant than 
ensembles. 


2 Density operators 


2.1 Introduction and definition 


We would like to develop a theory of states that combines randomness and quantum mechanics. 
So it is worth reviewing how both randomness and quantum mechanics can be viewed as two 
different ways of generalizing classical states. For simplicity, consider a classical system which can 
be in d different states labelled 1,2,...d. The quantum mechanical generalization of this would 
be to consider complex d-dimensional unit vectors while the probabilistic generalization would be 
nonnegative real d-dimensional vectors whose entries sum to one. These can be thought of as 
two incomparable generalizations of the classical picture. We are interested in considering both 
generalizations at once so that we consider state spaces that are both probabilistic and quantum. 
We summarize these different choices of state spaces in Table 3. 


                 classical                                                   quantum
deterministic    $\{1, \ldots, d\}$                                          $|\psi\rangle \in \mathbb{C}^d$ s.t. $\langle\psi|\psi\rangle = 1$
probabilistic    $p_1, \ldots, p_d \geq 0$ s.t. $p_1 + \ldots + p_d = 1$     ensembles? density operators?


Table 3: Different theories yield different state spaces. 


What do we put in the fourth box (probabilistic quantum) of Table 3? One possibility is 
to put ensembles of quantum states, as defined in (4). Besides the drawbacks mentioned in the
previous section, these also have the flaw of involving an unbounded number of degrees of
freedom. For example, let’s take a spin-1/2 particle (i.e. $d = 2$), so our quantum states are of the
form $c_+|+\rangle + c_-|-\rangle$. Then one such probability distribution is $|+\rangle$ with probability 1/3, $|-\rangle$ with
probability 1/2 and $\frac{|+\rangle + |-\rangle}{\sqrt{2}}$ with probability 1/6. Another distribution is $\cos(\theta)|+\rangle + \sin(\theta)|-\rangle$
where $\theta$ is distributed according to a Gaussian with mean 0 and variance $\sigma^2$.

This works, but there are an infinite number of degrees of freedom, even if we start with a single 
lousy electron spin! Surely nature would not be so cruel. 

Another drawback with this approach is that different distributions can give the same measure- 
ment statistics for all possible measurements. As a result, many of these infinite degrees of freedom 
turn out to be simply redundant. 

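To see how this works, suppose that we have a discrete distribution where state $|\psi_a\rangle$ occurs
with probability $p_a$, for $a = 1, \ldots, m$. Consider an observable $A$. The expectation of $A$ with respect
to this ensemble is:

\begin{align*}
\sum_{a=1}^m p_a \langle\psi_a|A|\psi_a\rangle
&= \sum_{a=1}^m p_a \operatorname{tr}\left[\langle\psi_a|A|\psi_a\rangle\right] && \text{since tr has no effect on $1 \times 1$ matrices} \\
&= \sum_{a=1}^m p_a \operatorname{tr}\left[A|\psi_a\rangle\langle\psi_a|\right] && \text{cyclic property of the trace} \\
&= \operatorname{tr}\Big[A \sum_{a=1}^m p_a |\psi_a\rangle\langle\psi_a|\Big] && \text{linearity of the trace} \\
&= \operatorname{tr}[A\rho] && \text{nice simple formula} \\
&= \langle A, \rho\rangle && \text{alternate interpretation}
\end{align*}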

We see that the measurement statistics are a function of the distribution only via the density
operator $\rho = \sum_{a=1}^m p_a |\psi_a\rangle\langle\psi_a|$. This has about $d^2$ degrees of freedom, by contrast with $O(d)$
degrees of freedom for known quantum states and with the $\infty$ degrees of freedom associated with
ensembles. This already solves one of the problems of ensembles, namely their use of unlimited
amounts of redundant information.

This argument used only discrete distributions over $\mathbb{C}^d$ but the extension to continuous
distributions and/or infinite-dimensional states is straightforward.
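
The identity between the ensemble average and $\operatorname{tr}[A\rho]$ can be checked numerically. The following is a minimal sketch, assuming a randomly generated ensemble of $m = 5$ states in dimension $d = 3$ and a random Hermitian observable; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 3, 5

# Random ensemble: m unit vectors in C^d with probabilities p_a.
psis = rng.normal(size=(m, d)) + 1j * rng.normal(size=(m, d))
psis /= np.linalg.norm(psis, axis=1, keepdims=True)
probs = rng.random(m)
probs /= probs.sum()

# Random Hermitian observable A.
a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
a = (a + a.conj().T) / 2

# Ensemble average: sum_a p_a <psi_a|A|psi_a>.
ensemble_avg = sum(p * np.vdot(psi, a @ psi) for p, psi in zip(probs, psis))

# Density operator rho = sum_a p_a |psi_a><psi_a| and the trace formula.
rho = sum(p * np.outer(psi, psi.conj()) for p, psi in zip(probs, psis))
trace_formula = np.trace(a @ rho)
```

Note that the ensemble here has more states than the dimension ($m > d$), yet only the $d \times d$ matrix `rho` is needed to predict the expectation value.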


Facts about traces: Let X be a matrix of dimension m x n and Y a matrix of dimension n x m. 
In general these will not be square, but XY and YX both are, so their traces are well-defined. In 
fact, they are equal! A quick calculation shows 

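\[ \operatorname{tr}[XY] = \sum_{i=1}^m \sum_{j=1}^n X_{i,j} Y_{j,i} = \operatorname{tr}[YX]. \tag{9} \]


This is called the cyclic property of the trace because it is often applied to traces of long strings of
matrices. For example, we can repeatedly apply (9) (using curly braces to indicate which blocks of
matrices we are calling $X$ and $Y$) to obtain


\[ \operatorname{tr}[\underbrace{ABC}_{X}\underbrace{D}_{Y}] = \operatorname{tr}[\underbrace{DAB}_{X}\underbrace{C}_{Y}] = \operatorname{tr}[\underbrace{CDA}_{X}\underbrace{B}_{Y}] = \operatorname{tr}[BCDA]. \tag{10} \]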


The trace can also be used to define an inner product on operators. Define 


\[ \langle X, Y\rangle = \operatorname{tr}[X^\dagger Y] = \sum_{i,j} X_{i,j}^* Y_{i,j}. \tag{11} \]


From this last expression we see that (X,Y) is equivalent to turning X and Y into vectors in 
the natural way (just listing all the elements in order) and taking the conventional inner product 
between those vectors. 
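
Both trace facts are easy to test on non-square matrices. The sketch below assumes random complex matrices of size $2 \times 4$ and $4 \times 2$; the helper `random_complex` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_complex(rows, cols):
    """Random complex matrix (illustrative helper)."""
    return rng.normal(size=(rows, cols)) + 1j * rng.normal(size=(rows, cols))

x = random_complex(2, 4)   # m x n
y = random_complex(4, 2)   # n x m

# XY is 2 x 2 and YX is 4 x 4, yet the traces agree, as in (9).
trace_xy = np.trace(x @ y)
trace_yx = np.trace(y @ x)

# Inner product (11): tr[X^dagger Z] equals the ordinary inner product
# of the two matrices flattened into vectors.
z = random_complex(2, 4)
inner_trace = np.trace(x.conj().T @ z)
inner_flat = np.vdot(x.ravel(), z.ravel())
```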


2.2 Examples 
2.2.1 Pure states 


If we know the state is $|\psi\rangle$, then the density matrix is $|\psi\rangle\langle\psi|$. Observe that there is no phase
ambiguity ($|\psi\rangle \to e^{i\phi}|\psi\rangle$ leaves the density matrix unchanged) and each $|\psi\rangle$ (up to this phase)
gives rise to a distinct density matrix. Such density matrices are called pure states, and sometimes this terminology is
also used when talking about wavefunctions, to justify not using the density matrix formalism. By
contrast, all other density matrices are called mixed states.


2.2.2 Spin-1/2 pure states 


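Let us consider the special case of pure states when $d = 2$, corresponding to a spin-1/2 particle. If
$|\vec n\rangle = \cos(\frac{\theta}{2})|+\rangle + \sin(\frac{\theta}{2})e^{i\phi}|-\rangle$ then

\begin{align*}
|\vec n\rangle\langle\vec n|
&= \begin{pmatrix} \cos^2(\frac{\theta}{2}) & \cos(\frac{\theta}{2})\sin(\frac{\theta}{2})e^{-i\phi} \\ \cos(\frac{\theta}{2})\sin(\frac{\theta}{2})e^{i\phi} & \sin^2(\frac{\theta}{2}) \end{pmatrix} \\
&= \begin{pmatrix} \cos^2(\frac{\theta}{2}) & 0 \\ 0 & \sin^2(\frac{\theta}{2}) \end{pmatrix} + \cos\frac{\theta}{2}\sin\frac{\theta}{2} \begin{pmatrix} 0 & e^{-i\phi} \\ e^{i\phi} & 0 \end{pmatrix} \\
&= \frac{I + \cos(\theta)\sigma_z}{2} + \frac{\sin(\theta)}{2}\left(\cos(\phi)\sigma_x + \sin(\phi)\sigma_y\right) \\
&= \frac{I + n_x\sigma_x + n_y\sigma_y + n_z\sigma_z}{2}.
\end{align*}
This result is beautiful enough to frame.
\[ |\vec n\rangle\langle\vec n| = \frac{I + \vec n\cdot\vec\sigma}{2} \tag{12} \]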


With this in hand we can return to the example of Alice measuring half of a singlet state. Whatever 
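her choice of $\vec n$, Bob’s density matrix is

\[ \frac{1}{2}|\vec n\rangle\langle\vec n| + \frac{1}{2}|-\vec n\rangle\langle-\vec n| = \frac{1}{2}\cdot\frac{I + \vec n\cdot\vec\sigma}{2} + \frac{1}{2}\cdot\frac{I - \vec n\cdot\vec\sigma}{2} = \frac{I}{2}. \tag{13} \]


This rules out their earlier attempts at instantaneous signaling (and later we will prove this in more
generality). Bob’s density matrix fully determines the results of any measurement he makes, and
it is independent of Alice’s choice of $\vec n$.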
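
As a sanity check of (12) and (13), the sketch below builds $|\vec n\rangle$ and $|-\vec n\rangle$ from the parametrization in (6) for one arbitrarily chosen direction (the angles 1.1 and 2.3 are just example values) and confirms that Bob's mixture is $I/2$. The function names are illustrative.

```python
import numpy as np

# Pauli matrices sigma_x, sigma_y, sigma_z stacked along the first axis.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def spin_state(theta, phi):
    """|n> = cos(theta/2)|+> + e^{i phi} sin(theta/2)|->, as in (6)."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def projector(psi):
    """Density matrix |psi><psi| of a pure state."""
    return np.outer(psi, psi.conj())

theta, phi = 1.1, 2.3
n = np.array([np.sin(theta) * np.cos(phi),
              np.sin(theta) * np.sin(phi),
              np.cos(theta)])

up = spin_state(theta, phi)                     # |n>
down = spin_state(np.pi - theta, phi + np.pi)   # |-n>: antipodal direction

bloch_form = (np.eye(2) + np.einsum("i,ijk->jk", n, SIGMA)) / 2   # right side of (12)
bob_rho = 0.5 * projector(up) + 0.5 * projector(down)             # left side of (13)
```

Changing `theta` and `phi` changes the ensemble but not `bob_rho`, which is the point of (13).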


2.2.3. The maximally mixed state 


If $\{|v_1\rangle, \ldots, |v_d\rangle\}$ are an orthonormal basis, and each occurs with probability $1/d$, then the resulting
density matrix is


\[ \rho = \frac{1}{d}\sum_{i=1}^d |v_i\rangle\langle v_i| = \frac{I}{d}, \tag{14} \]


independent of the choice of basis. This is called the maximally mixed state. The previous example
was the $d = 2$ case of this: a 1/2 probability of spin-up and 1/2 probability of spin-down results in
the same density matrix, no matter which direction “up” refers to.

The continuous distribution over all unit vectors in $\mathbb{C}^d$ also yields the same density matrix,
although this is a harder calculation.


2.2.4 Multiple decompositions 


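Consider the distribution where $|+\rangle$ occurs with probability 2/3 and $|-\rangle$ with probability 1/3. The
density matrix is


\[ \frac{2}{3}|+\rangle\langle+| + \frac{1}{3}|-\rangle\langle-| = \begin{pmatrix} \frac{2}{3} & 0 \\ 0 & \frac{1}{3} \end{pmatrix}. \tag{15} \]
Now consider the distribution
\begin{align*}
|\psi_1\rangle &= \sqrt{\tfrac{2}{3}}\,|+\rangle + \sqrt{\tfrac{1}{3}}\,|-\rangle && \text{with probability } 1/2 \\
|\psi_2\rangle &= \sqrt{\tfrac{2}{3}}\,|+\rangle - \sqrt{\tfrac{1}{3}}\,|-\rangle && \text{with probability } 1/2
\end{align*}
The density matrix is now
\[ \frac{1}{2}|\psi_1\rangle\langle\psi_1| + \frac{1}{2}|\psi_2\rangle\langle\psi_2| = \frac{1}{2}\begin{pmatrix} \frac{2}{3} & \frac{\sqrt{2}}{3} \\ \frac{\sqrt{2}}{3} & \frac{1}{3} \end{pmatrix} + \frac{1}{2}\begin{pmatrix} \frac{2}{3} & -\frac{\sqrt{2}}{3} \\ -\frac{\sqrt{2}}{3} & \frac{1}{3} \end{pmatrix} = \begin{pmatrix} \frac{2}{3} & 0 \\ 0 & \frac{1}{3} \end{pmatrix}. \tag{16} \]


One lesson is that we shouldn’t take the probabilities $p_a$ too seriously; i.e. they are not uniquely
determined by the density matrix. Neither is the property of the states in the ensemble being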
orthogonal. 
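
The two decompositions can be compared directly. This is a small sketch in which the helper `density` (a hypothetical name) turns a list of (probability, state) pairs into a density matrix.

```python
import numpy as np

plus = np.array([1.0, 0.0])
minus = np.array([0.0, 1.0])

def density(ensemble):
    """rho = sum_a p_a |psi_a><psi_a| for a list of (p_a, psi_a) pairs."""
    return sum(p * np.outer(psi, np.conj(psi)) for p, psi in ensemble)

# First ensemble: orthogonal states with unequal probabilities, as in (15).
ensemble_a = [(2 / 3, plus), (1 / 3, minus)]

# Second ensemble: non-orthogonal states with equal probabilities, as in (16).
psi1 = np.sqrt(2 / 3) * plus + np.sqrt(1 / 3) * minus
psi2 = np.sqrt(2 / 3) * plus - np.sqrt(1 / 3) * minus
ensemble_b = [(0.5, psi1), (0.5, psi2)]

rho_a = density(ensemble_a)
rho_b = density(ensemble_b)
overlap = np.vdot(psi1, psi2)   # 2/3 - 1/3, so the states are not orthogonal
```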


2.2.5 Thermal states 


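Introduce a Hamiltonian $H = \sum_{i=1}^d E_i |i\rangle\langle i|$. We can think of this classically, as saying that state $i$
has energy $E_i$. In this case, the Boltzmann distribution at temperature $T$ is $p_i = e^{-\beta E_i}/Z$, where
$Z = \sum_{i=1}^d e^{-\beta E_i}$, $\beta = 1/k_B T$ and $k_B \approx 1.38065 \cdot 10^{-23}$ J/K is Boltzmann’s constant. In the
quantum setting, “state $i$” is replaced by $|i\rangle$. The resulting density matrix is


\[ \rho = \sum_{i=1}^d p_i |i\rangle\langle i| = \sum_{i=1}^d \frac{e^{-\beta E_i}}{Z}|i\rangle\langle i| = \frac{e^{-\beta H}}{Z} = \frac{e^{-\beta H}}{\operatorname{tr} e^{-\beta H}}. \tag{17} \]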


This is known as the Gibbs state or the thermal state. It describes the state of a quantum system 
at thermal equilibrium. 


One specific example comes from NMR. Consider a proton spin in a magnetic field, say a 11.74 
Tesla field in the $\hat z$ direction. At this field strength, the proton spin will experience the Hamiltonian
$H = -\hbar\omega_0\sigma_z$, where $\omega_0 \approx 500$ MHz. (In fact, if you buy a 11.74T superconducting magnet, the
vendor will probably call it a “500 MHz” magnet for this reason. It could also reasonably be called
a 500K magnet because of its price.) The thermal state is


\begin{align*}
\rho = \frac{e^{-\beta H}}{\operatorname{tr} e^{-\beta H}}
&= \frac{e^{\beta\hbar\omega_0}|+\rangle\langle+| + e^{-\beta\hbar\omega_0}|-\rangle\langle-|}{e^{\beta\hbar\omega_0} + e^{-\beta\hbar\omega_0}} \\
&\approx \frac{(1 + \beta\hbar\omega_0)|+\rangle\langle+| + (1 - \beta\hbar\omega_0)|-\rangle\langle-|}{2} \\
&= \frac{I}{2} + \frac{\beta\hbar\omega_0}{2}\sigma_z.
\end{align*}


Let $T = 300$ K, which is close to room temperature. Then $1/\beta \approx 6.25$ THz and so $\beta\hbar\omega_0 \approx 10^{-4}$.
This is the “polarization” of the state. In an NMR experiment we can think of a $\frac{1}{2} + 10^{-4}$ fraction
of spins aligning with the external field and a $\frac{1}{2} - 10^{-4}$ fraction of the spins anti-aligning with the
external field at thermal equilibrium. This means that observables are effectively attenuated by a
factor of (in this case) $10^{-4}$.
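
The order of magnitude of the polarization can be reproduced with a few lines of arithmetic. The sketch below makes two assumptions that are not spelled out in the text: the "500 MHz" figure is treated as an ordinary frequency $f$, and the dimensionless splitting is taken to be $\epsilon = hf/k_BT$, so that the two populations are proportional to $e^{\pm\epsilon}$. Constant and variable names are illustrative.

```python
import numpy as np

K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s

temperature = 300.0        # K
frequency = 500e6          # Hz (assumed meaning of "500 MHz")

# Thermal energy expressed as a frequency, k_B T / h.
thermal_freq = K_B * temperature / H_PLANCK

# Dimensionless ratio of level splitting to thermal energy.
eps = frequency / thermal_freq

# Exact two-level thermal state with populations proportional to e^{+eps}, e^{-eps},
# compared against its first-order expansion I/2 + (eps/2) sigma_z.
rho_exact = np.diag([np.exp(eps), np.exp(-eps)]) / (2 * np.cosh(eps))
rho_linear = np.eye(2) / 2 + (eps / 2) * np.diag([1.0, -1.0])
polarization = rho_exact[0, 0] - rho_exact[1, 1]   # equals tanh(eps)
```

Under these assumptions `thermal_freq` comes out near 6.25 THz and `eps` is slightly below $10^{-4}$, consistent with the estimate quoted above.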


3 The general rule for density operators


We know what makes a vector a valid probability distribution or a valid quantum wave-vector.
What makes an operator a valid density operator? One answer is “$\rho$ is a valid density operator if
there exist $p_1, \ldots, p_m$, $|\psi_1\rangle, \ldots, |\psi_m\rangle$ such that $\rho = \sum_{a=1}^m p_a |\psi_a\rangle\langle\psi_a|$, $p_1, \ldots, p_m \geq 0$, $p_1 + \ldots + p_m = 1$
and $\langle\psi_a|\psi_a\rangle = 1$ for each $a$.” This is somewhat unsatisfactory if we want to build a theory where the
density operators are the fundamental objects.

Fortunately, there is a simple answer.


Theorem 1. If a $d \times d$ matrix $\rho$ is a density matrix for some ensemble of quantum states then
1. $\operatorname{tr}\rho = 1$.
2. $\rho \succeq 0$.


Conversely, for any $d \times d$ matrix $\rho$ satisfying these two conditions, there exists an ensemble
$\{p_a, |\psi_a\rangle\}_{1 \leq a \leq m}$ such that $\rho = \sum_{a=1}^m p_a |\psi_a\rangle\langle\psi_a|$. Here $m$ can be taken to be the rank of $\rho$.


The inequality $\rho \succeq 0$ means that $\rho$ is positive semidefinite, which is defined to mean that
$\langle\psi|\rho|\psi\rangle \geq 0$ for all $|\psi\rangle$. It is the matrix analogue of being nonnegative.


3.1 Positive semidefinite matrices


We say that a square matrix $A$ is positive semidefinite if $\langle\psi|A|\psi\rangle \geq 0$ for all $|\psi\rangle$. Physically, $A$
might be an observable that takes on only nonnegative values. Or it might be a density matrix. If
furthermore $A$ is Hermitian, then there are three equivalent ways to characterise the condition of
being positive semidefinite.


Theorem 2. If $A = A^\dagger$ then TFAE (the following are equivalent)
1. For all $|\psi\rangle$, $\langle\psi|A|\psi\rangle \geq 0$.


2. All eigenvalues of $A$ are nonnegative.


3. There exists a matrix $B$ such that $A = B^\dagger B$. (This is called a Cholesky factorization.)


To get intuition for this last condition, observe that for $1 \times 1$ matrices, it is the statement that
a real number $x \geq 0$ iff $x = z^* z$ for some complex $z$.


Proof of Theorem 2. Since $A$ is Hermitian, we can write $A = \sum_{i=1}^d \lambda_i |e_i\rangle\langle e_i|$ for some orthonormal
basis $\{|e_1\rangle, \ldots, |e_d\rangle\}$ and some real $\lambda_1, \ldots, \lambda_d$.
(1 → 2): Take $|\psi\rangle = |e_i\rangle$. Then $0 \leq \langle\psi|A|\psi\rangle = \langle e_i|A|e_i\rangle = \lambda_i$.


(2 → 3): Let $B = \sum_{i=1}^d \sqrt{\lambda_i}\, |e_i\rangle\langle e_i|$. As an aside, one can show that $B$ satisfies $A = B^\dagger B$ if and
only if $B = \sum_{i=1}^d \sqrt{\lambda_i}\, |f_i\rangle\langle e_i|$ for some orthonormal basis $\{|f_1\rangle, \ldots, |f_d\rangle\}$. Sometimes we say that
$B = \sqrt{A}$, by analogy to the scalar case.


(3 → 1): For any $|\psi\rangle$, let $|\varphi\rangle = B|\psi\rangle$. Then $\langle\psi|A|\psi\rangle = \langle\psi|B^\dagger B|\psi\rangle = \langle\varphi|\varphi\rangle \geq 0$.


3.2 Proof of the density-matrix conditions 


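Here we prove Theorem 1. Start with an ensemble $\{p_i, |\psi_i\rangle\}_{1 \leq i \leq m}$. Define $\rho = \sum_{i=1}^m p_i |\psi_i\rangle\langle\psi_i|$.
• Then $\rho^\dagger = \sum_{i=1}^m p_i^* |\psi_i\rangle\langle\psi_i| = \rho$, since $p_i = p_i^*$. Thus $\rho$ is Hermitian.
• Next, define $B = \sum_{i=1}^m \sqrt{p_i}\, |i\rangle\langle\psi_i|$, where $|1\rangle, \ldots, |m\rangle$ is an orthonormal basis of $\mathbb{C}^m$. Then $\rho = B^\dagger B$, implying that $\rho \succeq 0$.
• Finally, $\operatorname{tr}\rho = \sum_{i=1}^m p_i \operatorname{tr}|\psi_i\rangle\langle\psi_i| = \sum_{i=1}^m p_i = 1$.


To prove the other direction, suppose that $\operatorname{tr}\rho = 1$ and $\rho \succeq 0$. By Theorem 2, $\rho = \sum_{i=1}^d \lambda_i |e_i\rangle\langle e_i|$
for $\{|e_1\rangle, \ldots, |e_d\rangle\}$ an orthonormal basis and each $\lambda_i \geq 0$. Additionally $\operatorname{tr}\rho = \sum_{i=1}^d \lambda_i = 1$. Thus
we can take $p_i = \lambda_i$ and now $\rho$ is the density matrix corresponding to the ensemble $\{p_i, |e_i\rangle\}_{1 \leq i \leq d}$.
If rank $\rho < d$, then the sum only needs rank $\rho$ terms.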
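
The converse direction of Theorem 1 is constructive, and can be sketched in code. The example below is an illustration only: the function names `is_density_matrix` and `eigen_ensemble` and the tolerance `tol` are assumptions, and the test matrix is a rank-2 state in dimension 3 chosen so that the ensemble needs only rank $\rho$ terms.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-10):
    """Check Hermiticity, unit trace and positive semidefiniteness."""
    if not np.allclose(rho, rho.conj().T, atol=tol):
        return False
    if abs(np.trace(rho) - 1) > tol:
        return False
    return bool(np.linalg.eigvalsh(rho).min() >= -tol)

def eigen_ensemble(rho, tol=1e-10):
    """Ensemble {(lambda_i, |e_i>)} from the spectral decomposition,
    dropping eigenvalues that are numerically zero."""
    evals, evecs = np.linalg.eigh(rho)
    return [(lam, evecs[:, i]) for i, lam in enumerate(evals) if lam > tol]

# A rank-2 density matrix in dimension d = 3.
v = np.array([0, 1, 1]) / np.sqrt(2)
rho = 0.25 * np.diag([1.0, 0, 0]) + 0.75 * np.outer(v, v.conj())

ensemble = eigen_ensemble(rho)
rebuilt = sum(p * np.outer(e, e.conj()) for p, e in ensemble)

# A unit-trace Hermitian matrix with a negative eigenvalue is not a state.
not_a_state = np.diag([1.2, -0.2])
```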


3.3 Application to spin-1/2 particles: the Bloch ball 


The geometry of the set of density matrices is unfortunately not quite as simple as the state spaces 
we have encountered previously. Pure quantum states form a sphere (modulo the phase ambiguity)
and probability distributions form a simplex. Intersections of planes (such as tr[p] = 1) with the 
set of positive semidefinite matrices are called spectrahedra, and apart from the wonderful name, I 
will not explore their general properties here. 

However, the case of $d = 2$ is indeed simple and elegant. Given a Hermitian $2 \times 2$ matrix $A$,
when is it a valid density matrix? First, if it is Hermitian then we can express it as


\[ A = \frac{a_0 I + a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3}{2} \]


for some real numbers $a_0, a_1, a_2, a_3$. The factor of 2 in the denominator is arbitrary, but we will see
later that it simplifies things. If $A$ is a density matrix, then $1 = \operatorname{tr}[A] = a_0$. Thus, $A = \frac{I + \vec a\cdot\vec\sigma}{2}$ with
$\vec a = (a_1, a_2, a_3)$. We saw in 8.05 that $\mathrm{eig}(\vec a\cdot\vec\sigma) = \pm|\vec a|$. Thus $\mathrm{eig}(A) = \frac{1 \pm |\vec a|}{2}$. $A$ is psd iff these are
both nonnegative, which is true iff $|\vec a| \leq 1$.

This proves that the set of two-dimensional density matrices is precisely equal to the set


\[ \left\{ \frac{I + \vec a\cdot\vec\sigma}{2} : |\vec a| \leq 1 \right\}. \tag{18} \]




Geometrically this looks like the unit ball in $\mathbb{R}^3$. The pure states form the surface of the ball,
corresponding to the case $|\vec a| = 1$. The maximally mixed state $I/2$ corresponds to $\vec a = 0$. In
general, $|\vec a|$ can be thought of as the “purity” of a state.

This set is called the Bloch ball. The unit vectors at the surface are called the Bloch sphere. 
These have nothing to do with Bloch states or Bloch’s theorem (which arise in the solution of 
periodic potentials) except for the name of the inventor. 

Beware also that for d > 2, the set of density matrices is no longer a ball and there is no longer 
a canonical way to quantify “purity.” However, notions of entropy do exist and are used in fields 
such as quantum statistical mechanics. 
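
For $d = 2$ the Bloch vector can be read off from a density matrix using $a_i = \operatorname{tr}[\rho\sigma_i]$, which follows from $\operatorname{tr}[\sigma_i\sigma_j] = 2\delta_{ij}$. The sketch below (function names are illustrative) checks the eigenvalue formula $(1 \pm |\vec a|)/2$ and shows that a vector with $|\vec a| > 1$ does not give a valid state.

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def bloch_vector(rho):
    """a_i = tr[rho sigma_i], inverting rho = (I + a.sigma)/2."""
    return np.array([np.trace(rho @ s).real for s in SIGMA])

def from_bloch_vector(a):
    """rho = (I + a.sigma)/2 for a real 3-vector a."""
    a = np.asarray(a, dtype=complex)
    return (np.eye(2) + np.einsum("i,ijk->jk", a, SIGMA)) / 2

mixed = np.array([[2 / 3, 0], [0, 1 / 3]], dtype=complex)
a_mixed = bloch_vector(mixed)                    # points along z with length 1/3
norm = np.linalg.norm(a_mixed)
eigs = np.linalg.eigvalsh(mixed)                 # ascending order
predicted = np.array([(1 - norm) / 2, (1 + norm) / 2])

outside = from_bloch_vector([0, 0, 1.2])         # |a| > 1: has a negative eigenvalue
```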


4 Dynamics of density matrices 


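4.1 Schrödinger equation


The Schrödinger equation states that


\begin{align*}
\frac{d}{dt}|\psi\rangle &= -\frac{i}{\hbar}H|\psi\rangle \tag{19a} \\
\frac{d}{dt}\langle\psi| &= \frac{i}{\hbar}\langle\psi|H \tag{19b} \\
\frac{d}{dt}\left(|\psi\rangle\langle\psi|\right) &= -\frac{i}{\hbar}\left(H|\psi\rangle\langle\psi| - |\psi\rangle\langle\psi|H\right) = -\frac{i}{\hbar}\left[H, |\psi\rangle\langle\psi|\right] \tag{19c} \\
\frac{d}{dt}\sum_a p_a|\psi_a\rangle\langle\psi_a| &= -\frac{i}{\hbar}\Big[H, \sum_a p_a|\psi_a\rangle\langle\psi_a|\Big] \tag{19d}
\end{align*}
We conclude that the density matrix evolves in time according to
\[ i\hbar\frac{d}{dt}\rho = [H, \rho]. \tag{20} \]


This is reminiscent of the Heisenberg equation of motion for operators, but with the opposite sign


\[ i\hbar\frac{d}{dt}A_H = [A_H, H_H]. \tag{21} \]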
One way to explain the different signs is that states and observables are dual to each other, in the 
sense that they appear in the expectation value as (A, p). 

Another way to talk about quantum dynamics is in terms of unitary transformations. If a 
system undergoes Hamiltonian evolution for a finite time then this evolution can be described by 
a unitary operator $U$, so that state $|\psi\rangle$ gets mapped to $U|\psi\rangle$. In this case $|\psi\rangle\langle\psi|$ is mapped to
$U|\psi\rangle\langle\psi|U^\dagger$. By linearity, a general density matrix $\rho$ is then mapped to $U\rho U^\dagger$.
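
The two descriptions of the dynamics, the differential equation (20) and the map $\rho \mapsto U\rho U^\dagger$, can be compared numerically. The sketch below assumes $\hbar = 1$, an arbitrarily chosen single-spin Hamiltonian and initial state, and a central finite difference for the time derivative; none of these choices come from the notes.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Arbitrary Hermitian Hamiltonian for illustration (hbar = 1).
h = 0.8 * SIGMA_X + 0.3 * SIGMA_Z

def unitary(t):
    """U(t) = exp(-i h t), built from the eigendecomposition of h."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T

def evolve(rho, t):
    """rho -> U rho U^dagger."""
    u = unitary(t)
    return u @ rho @ u.conj().T

rho0 = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])

# Central finite difference for d(rho)/dt at t = 0 ...
dt = 1e-5
numerical_derivative = (evolve(rho0, dt) - evolve(rho0, -dt)) / (2 * dt)
# ... compared with -i [H, rho], the right-hand side of (20) divided by i.
von_neumann_rhs = -1j * (h @ rho0 - rho0 @ h)

rho_later = evolve(rho0, 1.3)
```

Unitary evolution leaves the eigenvalues of $\rho$ unchanged, which is one way to see that it cannot by itself turn a pure state into a mixed one.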


4.2 Measurement 


A similar argument shows that if we measure $\rho$ in the orthonormal basis $\{|v_1\rangle, \ldots, |v_d\rangle\}$, then the
probability of outcome $j$ is $\langle v_j|\rho|v_j\rangle$ and the post-measurement state is $|v_j\rangle\langle v_j|$. The fastest way
to see this is to consider the observable $|v_j\rangle\langle v_j|$ which has eigenvalue 1 (corresponding to obtaining
outcome $|v_j\rangle$) and eigenvalue 0 repeated $d - 1$ times (corresponding to the orthogonal outcomes).
Then we use the fact that $\langle A\rangle = \operatorname{tr}[A\rho]$ and set $A = |v_j\rangle\langle v_j|$.




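An alternate derivation is to decompose $\rho = \sum_{a=1}^m p_a|\psi_a\rangle\langle\psi_a|$. Then $\Pr[j|a] = |\langle v_j|\psi_a\rangle|^2$ and

\begin{align*}
\Pr[j] &= \sum_{a=1}^m p_a \Pr[j|a] \\
&= \sum_{a=1}^m p_a |\langle v_j|\psi_a\rangle|^2 \\
&= \sum_{a=1}^m p_a \langle v_j|\psi_a\rangle\langle\psi_a|v_j\rangle \\
&= \langle v_j|\Big(\sum_{a=1}^m p_a|\psi_a\rangle\langle\psi_a|\Big)|v_j\rangle \\
&= \langle v_j|\rho|v_j\rangle
\end{align*}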


It should be reassuring that, even though we used the ensemble decomposition in this derivation, 
the final probability we obtained depends only on p. 

What if we forget the measurement outcome, or never knew it (e.g. someone else measures the 
state while our back is turned)? Then p is mapped to 


\[ \sum_{j=1}^d \langle v_j|\rho|v_j\rangle\, |v_j\rangle\langle v_j| = \sum_{j=1}^d |v_j\rangle\langle v_j|\rho|v_j\rangle\langle v_j|. \tag{22} \]


Here it is important to note that density matrices, like probability distributions, represent not only 
objective states of the world but also subjective states; in other words, they describe our knowledge 
about a state. So subjective uncertainty (i.e. the state “really is” something definite but we don’t 
know what it is) will have implications for the density matrix. 


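If we now write $\rho$ from (22) as a matrix in the $|v_1\rangle, \ldots, |v_d\rangle$ basis, this looks like
\[ \begin{pmatrix} \rho_{1,1} & 0 & \cdots & 0 \\ 0 & \rho_{2,2} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \rho_{d,d} \end{pmatrix}. \]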


Can we unify measurement and unitary evolution the way that we have unified the probabilis- 
tic and quantum pictures of states? For example, how should we model an atom in an excited 
state undergoing fluorescence? We will return to this topic later when we discuss open quantum 
systems and quantum operations. However, already we are equipped to handle the phenomenon of 
decoherence, which is the monster lurking in the closet of every quantum mechanical experiment. 


4.3 Decoherence 


Unitary operators correspond to reversible operations: if $U$ is a valid unitary time evolution then
so is $U^\dagger$. In terms of Hamiltonians, evolution according to $-H$ will reverse evolution according to
$H$. But other quantum processes cause an irreversible loss of information. Irreversible quantum
processes are generally called “decoherence.” This somewhat imprecise term refers to the fact that
this information loss is always associated with a loss of “coherence” and with quantum systems
becoming more like classical systems. In what follows we will illustrate it via a series of examples,
but will not give a general definition.

Let’s warm up with the concept of a mixture. If state $|\psi_a\rangle$ occurs with probability $p_a$, then
the density matrix is $\sum_a p_a|\psi_a\rangle\langle\psi_a|$. But what if we have an ensemble of density matrices, e.g.
$\{(p_1, \rho_1), \ldots, (p_m, \rho_m)\}$? Then the “average” density matrix is


\[ \rho = \sum_{a=1}^m p_a\rho_a. \tag{23} \]


We can use this to model random unitary evolution. Suppose that our state experiences a 
random Hamiltonian. Model this by saying that unitary $U_a$ occurs with probability $p_a$ for $a = 1, \ldots, m$.
This corresponds to the map


\[ \rho \mapsto \sum_{a=1}^m p_a U_a\rho U_a^\dagger. \tag{24} \]


Let’s see how this can explain how coherence is lost in simple quantum systems. Suppose we 
start with the density matrix 


\[ \rho = \begin{pmatrix} \rho_{+,+} & \rho_{+,-} \\ \rho_{-,+} & \rho_{-,-} \end{pmatrix} \]


and choose a random unitary to perform as follows: with probability $1 - p$ we do nothing and with
probability $p$ we perform a unitary transformation equal to $\sigma_z$. This corresponds to the ensemble
of unitary transformations $\{(1 - p, I), (p, \sigma_z)\}$. The density matrix is then mapped to


\begin{align*}
\rho' &= (1 - p)I\rho I^\dagger + p\,\sigma_z\rho\sigma_z^\dagger \\
&= (1 - p)\begin{pmatrix} \rho_{+,+} & \rho_{+,-} \\ \rho_{-,+} & \rho_{-,-} \end{pmatrix} + p\begin{pmatrix} \rho_{+,+} & -\rho_{+,-} \\ -\rho_{-,+} & \rho_{-,-} \end{pmatrix} \\
&= \begin{pmatrix} \rho_{+,+} & (1 - 2p)\rho_{+,-} \\ (1 - 2p)\rho_{-,+} & \rho_{-,-} \end{pmatrix}
\end{align*}


If $p = 0$ then this of course corresponds to doing nothing, and if $p = 1$, we simply have $\rho' = \sigma_z\rho\sigma_z$.
In between we see that the diagonal terms remain the same, but the off-diagonal terms are reduced
in absolute value. The diagonal terms correspond to the probability of outcomes we would observe
if we measured in the $\hat z$ basis, and so it is not surprising that a $\hat z$ rotation would not affect these.
However, the off-diagonal terms reduce just as we would expect for a vector that is averaged with a
rotated version of itself. If $p = 1/2$, then the off-diagonal terms are completely eliminated, meaning
that all polarization in the $\hat x$ and $\hat y$ directions has been eliminated. One way to see this is that
the $\hat x$ and $\hat y$ polarization of $\sigma_z\rho\sigma_z$ is opposite to that of $\rho$. Thus averaging $\rho$ and $\sigma_z\rho\sigma_z$ leaves zero
polarization in the $\hat x$-$\hat y$ plane.
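
The random-$\sigma_z$ map is simple enough to try out directly. The sketch below (the function name `dephase` is illustrative) applies it to the state $\frac{|+\rangle + |-\rangle}{\sqrt{2}}$, whose density matrix has all four entries equal to 1/2, and shows the off-diagonal entries shrinking by the factor $1 - 2p$.

```python
import numpy as np

SIGMA_Z = np.diag([1.0, -1.0]).astype(complex)

def dephase(rho, p):
    """Apply sigma_z with probability p and the identity with probability 1 - p."""
    return (1 - p) * rho + p * (SIGMA_Z @ rho @ SIGMA_Z.conj().T)

# Density matrix of (|+> + |->)/sqrt(2): fully polarized along x.
rho = np.full((2, 2), 0.5, dtype=complex)

quarter = dephase(rho, 0.25)   # off-diagonals scaled by 1 - 2p = 1/2
half = dephase(rho, 0.5)       # off-diagonals removed entirely
flipped = dephase(rho, 1.0)    # sigma_z rho sigma_z: off-diagonals change sign
```

For $p = 1/2$ the result coincides with measuring in the $\hat z$ basis and forgetting the outcome, as in (22).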
With a series of examples, I will illustrate that: 


• Decoherence can be achieved in several ways that look different but have the same results.
• Decoherence destroys some quantum/wave-like effects, such as interference.
• This also involves the loss of information, often of phase information.




5 Examples of decoherence 


5.1 Looking inside a Mach-Zehnder interferometer 


This example is physically unrealistic (in one place) but makes the decoherence phenomenon clearest 
to see. 
A Mach-Zehnder interferometer is depicted in Fig. 1. 


[Figure 1: Mach-Zehnder interferometer, showing the beam splitters, mirrors, Detector 1 and Detector 2. Image by MIT OpenCourseWare, adapted from the Wikipedia article on Mach-Zehnder interferometers.]


At each point the photon can take one of two possible paths, which we denote by the states 
$|1\rangle$ and $|2\rangle$. Technically $|1\rangle$ means one photon in one mode and zero photons in the other modes, and
similarly for $|2\rangle$. Also, we use $|1\rangle, |2\rangle$ to first denote the two inputs to the first beam splitter, then
the two possible paths through the interferometer, and finally the two outputs of the second beam 
splitter leading to the detectors. 

Each beam splitter can be modeled as a unitary operator. If they are “50-50” beam splitters, 
then this operator is 


\[ U_{\rm bs} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]
Thus, a photon entering in state $|1\rangle$ will go through the first beam splitter and be transformed into
the state $\frac{|1\rangle + |2\rangle}{\sqrt{2}}$ corresponding to an even superposition of both paths. Assuming the paths have
the same length and refractive index, it will have the same state when it reaches the second beam
splitter. At this point the state will be mapped to
\[ U_{\rm bs}\frac{|1\rangle + |2\rangle}{\sqrt{2}} = \frac{|1\rangle + |2\rangle + |1\rangle - |2\rangle}{2} = |1\rangle \]
and the first detector will click with probability 1.

This is very different from what we’d observe if a particle entering a 50-50 beam splitter chose 
randomly which path to take. In that case, both detectors would click half the time. 

The usual reason to build a Mach-Zehnder experiment, though, is not only to demonstrate the 
wave nature of light, but to measure something. Suppose we put some object in one of the paths 
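so that light passing through it experiences a phase shift of $\theta$. This corresponds to the unitary
transformation


\[ U_{\rm ph} = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & 1 \end{pmatrix}. \tag{25} \]


Our modified experiment now corresponds to the sequence $U_{\rm bs}U_{\rm ph}U_{\rm bs}$, which maps $|1\rangle$ to


\[ U_{\rm bs}U_{\rm ph}U_{\rm bs}|1\rangle = U_{\rm bs}U_{\rm ph}\frac{|1\rangle + |2\rangle}{\sqrt{2}} = U_{\rm bs}\frac{e^{i\theta}|1\rangle + |2\rangle}{\sqrt{2}} = \frac{e^{i\theta}|1\rangle + e^{i\theta}|2\rangle + |1\rangle - |2\rangle}{2}. \]


The probability of the first detector clicking is now $\left|\frac{1 + e^{i\theta}}{2}\right|^2 = \cos^2(\theta/2)$.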

Now add decoherence. Suppose you find a way to look at which branch the photon is in without 
destroying the photon. (This part is a bit unrealistic, but if we use larger objects, then it becomes 
more reasonable. See the readings for a description of a two-slit experiment conducted with $C_{60}$
molecules.) If we observe it then we will find that regardless of the phase shift $\theta$:


• the photon is equally likely to be in each path; and
• each detector is equally likely to click.


Our measurement has caused decoherence that has destroyed the phase information in $\theta$.
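
The interferometer with and without the which-path observation can be simulated in a few lines. This sketch assumes the phase shifter acts on path $|1\rangle$ (so $U_{\rm ph} = \mathrm{diag}(e^{i\theta}, 1)$) and models the unrecorded path measurement by zeroing the off-diagonal entries of the density matrix in the path basis, as in (22). Function names are illustrative.

```python
import numpy as np

U_BS = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # 50-50 beam splitter
PSI_IN = np.array([1.0, 0.0])                     # photon enters in |1>

def phase_shifter(theta):
    """Phase theta on path |1> (assumed placement of the object)."""
    return np.diag([np.exp(1j * theta), 1.0])

def detector_probs(theta):
    """Click probabilities of detectors 1 and 2 with coherence intact."""
    psi_out = U_BS @ phase_shifter(theta) @ U_BS @ PSI_IN
    return np.abs(psi_out) ** 2

def detector_probs_with_which_path(theta):
    """Same experiment, but the path is observed before the second beam splitter."""
    psi_mid = phase_shifter(theta) @ U_BS @ PSI_IN
    rho_mid = np.outer(psi_mid, psi_mid.conj())
    rho_mid = np.diag(np.diag(rho_mid))           # forget the outcome: map (22)
    rho_out = U_BS @ rho_mid @ U_BS.conj().T
    return np.real(np.diag(rho_out))

theta = 0.9
coherent = detector_probs(theta)
decohered = detector_probs_with_which_path(theta)
```

With coherence intact the first entry of `coherent` follows $\cos^2(\theta/2)$; after the path observation both detectors click with probability 1/2 for every $\theta$.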


5.2 Spin rotations in NMR 
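Start with a spin-1/2 particle in the $|+\rangle$ state. Apply $H = S_y$ for time $t = \pi/2$, so that

\[ U = e^{-iHt/\hbar} = e^{-i\frac{\pi}{4}\sigma_y} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. \]

Applying $U$ once yields $U|+\rangle = \frac{|+\rangle + |-\rangle}{\sqrt{2}}$ and applying $U$ a second time yields $|-\rangle$ (calculation
omitted).
Suppose that we measure in the $\{|+\rangle, |-\rangle\}$ basis after applying the first $U$. Then each outcome
occurs with probability 1/2 and the resulting density matrix is


\[ \frac{1}{2}|+\rangle\langle+| + \frac{1}{2}|-\rangle\langle-| = \frac{I}{2}. \]
Applying $U$ again leaves the density matrix unchanged. Decoherence has destroyed the polarization
of the spin.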

In actual NMR experiments, we have a test tube with $10^{20}$ water molecules at room temperature
and we are not going to measure their individual spins. Instead, suppose that two nuclear spins get
close to each other and interact briefly. Suppose that the first spin is in state $\rho$ and the second spin
is maximally mixed (i.e. density matrix $I/2$). Suppose that they interact for a time $t$ according
to the Hamiltonian


\[ H = \lambda S_z \otimes S_z. \]


(Why not $\vec S\cdot\vec S = \sum_i S_i \otimes S_i$? This is a consequence of perturbation theory: if there is a
large $S_z \otimes I + I \otimes S_z$ term in the Hamiltonian, then the $S_x \otimes S_x$ and $S_y \otimes S_y$ terms are suppressed
but the $S_z \otimes S_z$ term is not.) This is equivalent to the first spin experiencing a Hamiltonian $\lambda S_z$
if the second spin is in a $|+\rangle$ state, and experiencing $-\lambda S_z$ if the second spin is in a $|-\rangle$ state.




Averaging over these, the first spin is mapped to the state 


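\begin{align*}
&\frac{1}{2}e^{-i\lambda t S_z}\rho\, e^{i\lambda t S_z} + \frac{1}{2}e^{i\lambda t S_z}\rho\, e^{-i\lambda t S_z} \\
&= \frac{1}{2}\begin{pmatrix} e^{-i\lambda t/2} & 0 \\ 0 & e^{i\lambda t/2} \end{pmatrix}\rho\begin{pmatrix} e^{i\lambda t/2} & 0 \\ 0 & e^{-i\lambda t/2} \end{pmatrix} + \frac{1}{2}\begin{pmatrix} e^{i\lambda t/2} & 0 \\ 0 & e^{-i\lambda t/2} \end{pmatrix}\rho\begin{pmatrix} e^{-i\lambda t/2} & 0 \\ 0 & e^{i\lambda t/2} \end{pmatrix} \\
&= \begin{pmatrix} \rho_{++} & \cos(\lambda t)\rho_{+-} \\ \cos(\lambda t)\rho_{-+} & \rho_{--} \end{pmatrix}
\end{align*}


This doesn’t completely destroy the off-diagonal terms, but attenuates them. Here we should think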
of At as usually small. 

If we average over many such interactions, then this might (skipping many steps, which you 
will explore on the pset) result in a process that looks like 


\[ \dot\rho = -\frac{1}{T_2}\begin{pmatrix} 0 & \rho_{+-} \\ \rho_{-+} & 0 \end{pmatrix}, \tag{26} \]


where $T_2$ is the decoherence time, sometimes also called the dephasing time for this kind of
decoherence.

If this is $T_2$, then is there also a $T_1$? Yes, $T_1$ refers to a different kind of decoherence. In NMR,
there is typically a static magnetic field in the $\hat{z}$ direction, which gives rise to a Hamiltonian of the
form $H = -\gamma B S_z$. From this (together with the temperature) we obtain a thermal state $\rho_{\text{thermal}}$
described in Section 2.2.5. The process of thermalization is challenging to derive rigorously from
the Schrödinger equation but it is usually sufficient to model it phenomenologically. Suppose that
according to a Poisson process with rate $1/T_1$, the spin is discarded and replaced with a fresh spin
in the state $\rho_{\text{thermal}}$. Then we would obtain the differential equation


$$\dot\rho = -\frac{1}{T_1}\left(\rho - \rho_{\text{thermal}}\right). \tag{27}$$


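Of course, there is another source of dynamics, which is the natural time evolution from the
Schrödinger equation: $\dot\rho = -\frac{i}{\hbar}[H, \rho]$. Putting this together, we obtain the Bloch equation:

$$\dot\rho = -\frac{i}{\hbar}[H, \rho] - \frac{1}{T_1}\left(\rho - \rho_{\text{thermal}}\right) - \frac{1}{T_2}\begin{pmatrix} 0 & \rho_{+-} \\ \rho_{-+} & 0 \end{pmatrix}. \tag{28}$$

If we write $\rho = \frac{I + \vec{a}\cdot\vec{\sigma}}{2}$ then (28) becomes

$$\frac{\partial \vec{a}}{\partial t} = M\vec{a} + \vec{b} \tag{29}$$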


for M, b to be determined on a pset. 

(Why do I keep talking about NMR, and not ESR (electron spin resonance)? The electron’s 
gyromagnetic ratio is about 658 times higher than the proton's so its room-temperature polarization
is larger by about this amount, and signals from it are easier to detect. However, it also interacts
more promiscuously and thus often decoheres quickly, with $T_2$ on the order of microseconds or
worse in most cases. So when you get a knee injury, your diagnosis will be made via your nuclei 
and not your electrons.) 




5.3 Spontaneous emission 


Consider an atom with states $|g\rangle$ and $|e\rangle$, corresponding to "ground" and "excited." We will also
consider a photon mode, i.e. a harmonic oscillator. Suppose the initial state of the system is
$|\psi\rangle_{\text{atom}} \otimes |0\rangle_{\text{photon}}$ with $|\psi\rangle = c_1|g\rangle + c_2|e\rangle$. These will interact via the Jaynes-Cummings
Hamiltonian

$$H = \Omega\left(|g\rangle\langle e| \otimes a^\dagger + |e\rangle\langle g| \otimes a\right). \tag{30}$$

(For simplicity we have left out some terms that are usually in this Hamiltonian. This Hamiltonian
can be derived using perturbation theory, as we discussed on a pset.) Suppose that the atom and
photon field interact via this Hamiltonian for a time $t$. Assume that $\delta = \Omega t$ is small and expand
the state of the system in powers of $\delta$:

$$e^{-iHt}|\psi\rangle \otimes |0\rangle = (c_1|g\rangle + c_2|e\rangle) \otimes |0\rangle - i\delta c_2 |g\rangle \otimes |1\rangle - \frac{\delta^2}{2} c_2 |e\rangle \otimes |0\rangle + O(\delta^3). \tag{31}$$


Now measure and we see that with probability $|c_2|^2\delta^2$ the photon number is 1 and the atom is
in the state $|g\rangle$. In this case, we observe an emitted photon and can conclude that the atom must
currently be in the state $|g\rangle$. (It is tempting to conclude that we know it was previously in the
state $|e\rangle$. This sort of reasoning about the past can be dangerous. In fact, all we can conclude is
that $c_2$ must have been nonzero.)

With probability $|c_1|^2 + (1 - \delta^2)|c_2|^2 = 1 - |c_2|^2\delta^2$ we observe 0 photons and the (unnormalized) state is

$$c_1|g\rangle + \left(1 - \frac{\delta^2}{2}\right) c_2 |e\rangle$$

again, up to $O(\delta^3)$ corrections. If we repeat this for long enough then we also end up in the state
|g). This is because if we watch an atom for a long time and it never emits a photon we can 
conclude that it’s probably in the ground state. 
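
The "watching and seeing nothing" argument can be checked numerically. The sketch below is only illustrative and rests on assumptions not spelled out above: the photon mode is reset to $|0\rangle$ before every round, each round uses the exact rotation $|e,0\rangle \mapsto \cos\delta\,|e,0\rangle - i\sin\delta\,|g,1\rangle$ rather than its small-$\delta$ expansion, and the names `no_photon_update`, `delta` and `survival` are made up for this example.

```python
import math

def no_photon_update(c1, c2, delta):
    """One round: interact for angle delta, observe zero photons, renormalize.

    Assumes the photon mode starts each round in |0>. The |g,0> amplitude is
    untouched, while the |e,0> amplitude is multiplied by cos(delta).
    """
    amp_g = c1
    amp_e = c2 * math.cos(delta)
    p_no_photon = abs(amp_g) ** 2 + abs(amp_e) ** 2
    norm = math.sqrt(p_no_photon)
    return amp_g / norm, amp_e / norm, p_no_photon

# Start in an equal superposition of ground and excited states.
c1, c2 = 1 / math.sqrt(2), 1 / math.sqrt(2)
delta = 0.1
survival = 1.0  # probability that no photon has been seen in any round so far

for _ in range(2000):
    c1, c2, p = no_photon_update(c1, c2, delta)
    survival *= p

print("ground population :", abs(c1) ** 2)
print("excited population:", abs(c2) ** 2)
print("P(never saw photon):", survival)
```

Conditioned on never seeing a photon, the excited population shrinks toward zero, and the probability of that history tends to the initial ground-state weight $|c_1|^2$.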


6 Multipartite density matrices 


In Section 1.3 I complained that pure-state quantum mechanics is not closed under discarding
subsystems. Here we will see that density matrices solve this problem. More generally we will
extend the formalism of density matrices to handle composite quantum systems.


6.1 Product states 


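Suppose that we have two systems, called A and B, with density matrices $\rho$ and $\sigma$ respectively. I
claim that their joint state is $\rho \otimes \sigma$.

One way to see this is by explicitly decomposing $\rho = \sum_i p_i |\alpha_i\rangle\langle\alpha_i|$, $\sigma = \sum_j q_j |\beta_j\rangle\langle\beta_j|$ and
considering these as independent ensembles $\{(p_i, |\alpha_i\rangle)\}$ and $\{(q_j, |\beta_j\rangle)\}$. According to the rule for
independent probability distributions the probability of finding system A in state $|\alpha_i\rangle$ and system
B in state $|\beta_j\rangle$ is $p_i q_j$. In this case the joint state is $|\alpha_i\rangle \otimes |\beta_j\rangle$. This corresponds to the ensemble
$\{(p_i q_j, |\alpha_i\rangle \otimes |\beta_j\rangle)\}$ which has density matrix

$$\sum_{i,j} p_i q_j \left(|\alpha_i\rangle \otimes |\beta_j\rangle\right)\left(\langle\alpha_i| \otimes \langle\beta_j|\right) = \sum_{i,j} p_i q_j |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_j\rangle\langle\beta_j| = \rho \otimes \sigma. \tag{32}$$

Therefore the product rule for density matrices can be inferred from the product rule for pure
states.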




We can also derive it from observables. If we measure observable $A$ on the first system then
this corresponds to the observable $A \otimes I$ on the composite system; likewise $B$ on the second system
corresponds to $I \otimes B$ on the joint system. Their product is $A \otimes B$. This arises for example when
the dipole moments of two spins are coupled and the Hamiltonian gets a term proportional to
$\vec{S}_1 \cdot \vec{S}_2 = S_x \otimes S_x + S_y \otimes S_y + S_z \otimes S_z$. Let $\omega$ be the joint state of a system where the first particle
is in state $\rho$ and the second is in state $\sigma$. The expectation of $A \otimes B$ with respect to $\omega$ should be

$$\operatorname{tr}[\rho A]\operatorname{tr}[\sigma B] = \operatorname{tr}[(\rho \otimes \sigma)(A \otimes B)]. \tag{33}$$

Since this is equal to $\operatorname{tr}[\omega(A \otimes B)]$ for all choices of $A, B$, we must have that $\omega = \rho \otimes \sigma$.


6.2 Partial measurement and partial trace 


Density matrices were motivated by the fact that measuring one part of a larger system leaves
the rest in a random state. As a result, pure-state quantum mechanics is not a closed theory; if
the state of the joint system AB is pure, then it is possible that the states of A and B are not
themselves pure. However, if there is a density matrix $\rho^{AB}$ describing the joint state of systems A
and B then we should be able to define density matrices for the individual systems. Indeed any
observable $X$ on system A should have a well-defined expectation value, and these can be used
to define a reduced density matrix for system A. Mathematically we denote the density matrix of
system A by $\rho^A$ and define it to be the unique density matrix satisfying

$$\operatorname{tr}[\rho^A X] = \operatorname{tr}[\rho^{AB}(X \otimes I)], \tag{34}$$

for all observables $X$. (It is an instructive exercise to verify that there is always a solution to
(34) and that it is unique.) Similarly we can define the state of system B to be $\rho^B$ satisfying
$\operatorname{tr}[\rho^B X] = \operatorname{tr}[\rho^{AB}(I \otimes X)]$.

Expanding (34) in terms of matrix elements yields

$$\sum_{a,a'} \rho^A_{a,a'} X_{a',a} = \sum_{a,b,a',b'} \rho^{AB}_{ab,a'b'} X_{a',a}\,\delta_{b',b} = \sum_{a,a',b} \rho^{AB}_{ab,a'b} X_{a',a}. \tag{35}$$

Since this must hold for any $X$, we have

$$\rho^A_{a,a'} = \left(\operatorname{tr}_B[\rho^{AB}]\right)_{a,a'} = \sum_b \rho^{AB}_{ab,a'b}. \tag{36}$$

This looks like taking a trace over the B subsystem (i.e. summing over the $b = b'$ entries) while
leaving the A system alone. For this reason we call the map $\rho^{AB} \mapsto \rho^A$ the "partial trace"
and denote it $\operatorname{tr}_B$; i.e. $\rho^A = \operatorname{tr}_B[\rho^{AB}]$. The partial trace is the quantum analogue of the rule for
marginals of probability distributions: $p^X(x) = \sum_y p^{XY}(x,y)$.

A similar equation holds for $\rho^B = \operatorname{tr}_A[\rho^{AB}]$ which can be expressed in terms of matrix elements
as

$$\rho^B_{b,b'} = \left(\operatorname{tr}_A[\rho^{AB}]\right)_{b,b'} = \sum_a \rho^{AB}_{ab,ab'}. \tag{37}$$


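If A and B have dimensions $d_A$ and $d_B$ respectively and $M_d$ denotes the set of $d \times d$ matrices,
then $\operatorname{tr}_A : M_{d_A d_B} \to M_{d_B}$ and $\operatorname{tr}_B : M_{d_A d_B} \to M_{d_A}$ are linear maps defined by

$$\operatorname{tr}_A\left[|\alpha\rangle\langle\beta| \otimes |\gamma\rangle\langle\delta|\right] = \langle\beta|\alpha\rangle \cdot |\gamma\rangle\langle\delta|$$
$$\operatorname{tr}_B\left[|\alpha\rangle\langle\beta| \otimes |\gamma\rangle\langle\delta|\right] = \langle\delta|\gamma\rangle \cdot |\alpha\rangle\langle\beta|$$

If $\{|a\rangle\}$ and $\{|b\rangle\}$ are orthonormal bases then $\operatorname{tr}_A[|a\rangle\langle a'| \otimes |b\rangle\langle b'|] = \delta_{a,a'} |b\rangle\langle b'|$ and
$\operatorname{tr}_B[|a\rangle\langle a'| \otimes |b\rangle\langle b'|] = \delta_{b,b'} |a\rangle\langle a'|$.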


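Let's illustrate this by revisiting the example of spontaneous emission from Section 5.3. Suppose

$$|\psi\rangle = e^{-iHt}\,\frac{|g\rangle + |e\rangle}{\sqrt{2}} \otimes |0\rangle,$$

where $H = \Omega(|g\rangle\langle e| \otimes a^\dagger + |e\rangle\langle g| \otimes a)$. We have $H|g,0\rangle = 0$, and $e^{-iHt}$ acts on the $\{|e,0\rangle, |g,1\rangle\}$
subspace as a rotation by the angle $\theta = \Omega t$. Thus

$$|\psi\rangle = \frac{1}{\sqrt{2}}|g,0\rangle + \frac{1}{\sqrt{2}}\cos(\theta)|e,0\rangle - \frac{i}{\sqrt{2}}\sin(\theta)|g,1\rangle.$$

The corresponding density matrix is

$$|\psi\rangle\langle\psi| = \begin{array}{c|cccc}
 & \langle g,0| & \langle e,0| & \langle g,1| & \langle e,1| \\ \hline
|g,0\rangle & \frac{1}{2} & \frac{1}{2}\cos(\theta) & \frac{i}{2}\sin(\theta) & 0 \\
|e,0\rangle & \frac{1}{2}\cos(\theta) & \frac{1}{2}\cos^2(\theta) & \frac{i}{2}\cos(\theta)\sin(\theta) & 0 \\
|g,1\rangle & -\frac{i}{2}\sin(\theta) & -\frac{i}{2}\sin(\theta)\cos(\theta) & \frac{1}{2}\sin^2(\theta) & 0 \\
|e,1\rangle & 0 & 0 & 0 & 0
\end{array} \tag{38}$$

The reduced state of the atom is

$$\operatorname{tr}_{\text{photon}}|\psi\rangle\langle\psi| = \begin{array}{c|cc}
 & \langle g| & \langle e| \\ \hline
|g\rangle & \frac{1}{2} + \frac{1}{2}\sin^2(\theta) & \frac{1}{2}\cos(\theta) \\
|e\rangle & \frac{1}{2}\cos(\theta) & \frac{1}{2}\cos^2(\theta)
\end{array} \tag{39}$$

and the reduced state of the photon is

$$\operatorname{tr}_{\text{atom}}|\psi\rangle\langle\psi| = \begin{array}{c|cc}
 & \langle 0| & \langle 1| \\ \hline
|0\rangle & \frac{1}{2} + \frac{1}{2}\cos^2(\theta) & \frac{i}{2}\sin(\theta) \\
|1\rangle & -\frac{i}{2}\sin(\theta) & \frac{1}{2}\sin^2(\theta)
\end{array} \tag{40}$$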


(The decorations surrounding the above matrices are meant as reminders of which basis elements 
the rows and columns correspond to.) 
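
As a sanity check on the partial-trace rules (36) and (37), the reduced states of this example can be computed directly. This is a minimal sketch, not part of the original derivation: it assumes the basis ordering index = 2·(atom) + (photon) with $g = 0$, $e = 1$, and the helper names `ket_to_rho` and `partial_trace` and the value `theta = 0.7` are invented for the illustration.

```python
import math

def ket_to_rho(psi):
    """Density matrix |psi><psi| as a nested list of complex numbers."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def partial_trace(rho, d_a, d_b, keep):
    """Partial trace of a (d_a*d_b)-dimensional matrix indexed by a*d_b + b.

    keep="A" sums over the B index (eq. 36); keep="B" sums over the A index (eq. 37).
    """
    if keep == "A":
        return [[sum(rho[a * d_b + b][a2 * d_b + b] for b in range(d_b))
                 for a2 in range(d_a)] for a in range(d_a)]
    return [[sum(rho[a * d_b + b][a * d_b + b2] for a in range(d_a))
             for b2 in range(d_b)] for b in range(d_b)]

theta = 0.7  # arbitrary illustrative value of Omega * t
s = 1 / math.sqrt(2)
# Amplitudes in the order |g,0>, |g,1>, |e,0>, |e,1>.
psi = [complex(s), complex(0, -s * math.sin(theta)), complex(s * math.cos(theta)), 0j]

rho = ket_to_rho(psi)
rho_atom = partial_trace(rho, 2, 2, keep="A")
rho_photon = partial_trace(rho, 2, 2, keep="B")

print("atom  :", rho_atom)
print("photon:", rho_photon)
```

The printed matrices can be compared entry by entry with (39) and (40); both have unit trace, and neither is a pure state unless $\sin\theta = 0$.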


6.3 Purifications 


One way density matrices can arise is via subjective uncertainty; i.e. we don’t know what the state 
is, but it “really” is pure. If so, we might imagine that density matrices would be useful for a 
quantum theory of statistics or information, but are not essential to quantum physics. However, 
density matrices also arise in settings where the overall state is known exactly. We saw this earlier 
where Bob could not distinguish his half of a singlet from a uniformly random state. Conversely, 
a uniformly random state cannot be distinguished from half of a singlet, with the other half in an 
unknown location. This is in fact only a representative example of the general rule that any density 
matrix could arise by being part of an entangled state. 

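First, let $|\psi\rangle = \sum_{i=1}^{d_A}\sum_{j=1}^{d_B} a_{i,j}|i\rangle \otimes |j\rangle$. If Bob measures his system, he obtains outcome $j$ with
probability $p_j = \sum_i |a_{i,j}|^2$ and the residual state for Alice is $\sum_i a_{i,j}|i\rangle/\sqrt{p_j}$. Her density matrix is

$$\sum_{j=1}^{d_B} p_j\, \frac{\left(\sum_{i=1}^{d_A} a_{i,j}|i\rangle\right)\left(\sum_{i'=1}^{d_A} a^*_{i',j}\langle i'|\right)}{p_j} = \sum_{i=1}^{d_A}\sum_{i'=1}^{d_A} \underbrace{\sum_{j=1}^{d_B} a_{i,j}a^*_{i',j}}_{=(aa^\dagger)_{i,i'}} |i\rangle\langle i'| = aa^\dagger.$$

What if Alice measures? Working this out is a good exercise. The answer is $(a^\dagger a)^T$.
By Theorem 2 any density matrix $\rho$ can be written as $aa^\dagger$ for some matrix $a$. It remains only
to check the normalization to ensure that $|\psi\rangle$ is a valid state:

$$1 = \operatorname{tr}\rho = \operatorname{tr} aa^\dagger = \sum_{i,j} |a_{i,j}|^2.$$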
tj 


This means that if we produce p in the lab, we can never know whether the state is mixed 
because of uncertainty about which pure state it is, or because it is entangled with a particle that 
is out of our control. 
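
As a small worked example of this construction (a sketch for one convenient case; the diagonal form of $\rho$ and the parameter $p$ are chosen here only for illustration), take a qubit density matrix that is diagonal in the basis $\{|0\rangle, |1\rangle\}$ and choose $a = \sqrt{\rho}$:

```latex
\rho = \begin{pmatrix} p & 0 \\ 0 & 1-p \end{pmatrix},
\qquad
a = \begin{pmatrix} \sqrt{p} & 0 \\ 0 & \sqrt{1-p} \end{pmatrix},
\qquad
aa^\dagger = \rho,
\qquad
|\psi\rangle = \sum_{i,j} a_{i,j}\,|i\rangle\otimes|j\rangle
             = \sqrt{p}\,|0\rangle\otimes|0\rangle + \sqrt{1-p}\,|1\rangle\otimes|1\rangle,
\qquad
\sum_{i,j}|a_{i,j}|^2 = p + (1-p) = 1 .
```

For $p = 1/2$ this recovers the familiar statement that the maximally mixed state $I/2$ is half of a maximally entangled pair.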




8.06 Spring 2016 Lecture Notes 


4. Identical particles 


Aram Harrow 


Last updated: May 19, 2016 


Contents

1 Fermions and Bosons
1.1 Introduction and two-particle systems
1.2 N particles
1.3 Non-interacting particles
1.4 Non-zero temperature
1.5 Composite particles
1.6 Emergence of distinguishability

2 Degenerate Fermi gas
2.1 Electrons in a box
2.2 White dwarves
2.3 Electrons in a periodic potential

3 Charged particles in a magnetic field
3.1 The Pauli Hamiltonian
3.2 Landau levels
3.3 The de Haas-van Alphen effect
3.4 Integer Quantum Hall Effect
3.5 Aharonov-Bohm Effect


1 Fermions and Bosons 


1.1. Introduction and two-particle systems 


Previously we have discussed multiple-particle systems using the tensor-product formalism (cf. 
Section 1.2 of Chapter 3 of these notes). But this applies only to distinguishable particles. In reality, 
all known particles are indistinguishable. In the coming lectures, we will explore the mathematical 
and physical consequences of this. 

First, consider classical many-particle systems. If a single particle has state described by
position and momentum $(\vec{r}, \vec{p})$, then the state of $N$ distinguishable particles can be written as
$(\vec{r}_1, \vec{p}_1, \vec{r}_2, \vec{p}_2, \ldots, \vec{r}_N, \vec{p}_N)$. The notation $(\cdot,\cdot,\ldots,\cdot)$ denotes an ordered list, in which different
positions have different meanings; e.g. in general $(\vec{r}_1, \vec{p}_1, \vec{r}_2, \vec{p}_2) \neq (\vec{r}_2, \vec{p}_2, \vec{r}_1, \vec{p}_1)$.


To describe indistinguishable particles, we can use set notation. For example, the sets {a, b, c} 
and {c,a,b} are equal. We can thus denote the state of N indistinguishable particles as 


$$\{(\vec{r}_1, \vec{p}_1), (\vec{r}_2, \vec{p}_2), \ldots, (\vec{r}_N, \vec{p}_N)\}. \tag{1}$$


(We can either forbid two particles from having exactly identical positions and momenta, or can let 
{...} denote a multiset, meaning a set with the possibility of repeated elements.) This notation is 
meant to express that the particles do not have individual identities, and that there is no physical 
or mathematical difference between what we call particle 1, particle 2, etc. 

In the quantum mechanical case, suppose we have $N$ particles each with single-particle state
space given by a vector space $V$. If the particles were distinguishable the composite space would
be given by $V^{\otimes N} = V \otimes \cdots \otimes V$. For example, the spins of $N$ spin-1/2 particles have state space
$(\mathbb{C}^2)^{\otimes N}$. The wavefunction of $N$ particles in 3-d is a function $\psi(\vec{r}_1, \ldots, \vec{r}_N)$ that maps $\mathbb{R}^{3N}$ to
$\mathbb{C}$. If $\mathcal{S}(\mathbb{R}^3)$ denotes well-behaved functions on $\mathbb{R}^3$ (formally called the Schwartz space), then this
$N$-particle state space is equivalent to $\mathcal{S}(\mathbb{R}^3)^{\otimes N}$. If this were a wavefunction of indistinguishable
particles, then it is natural to guess that it should not change if we exchange the positions of the
particles, e.g. swapping $\vec{r}_1$ and $\vec{r}_2$. This turns out not to be quite true, since it may be that
swapping two positions could result in an unobservable change, such as multiplying by an overall
phase.

To be more concrete, consider the case of two indistinguishable particles. Then we should have
$|\psi(\vec{r}_1, \vec{r}_2)| = |\psi(\vec{r}_2, \vec{r}_1)|$, or equivalently

$$\psi(\vec{r}_1, \vec{r}_2) = e^{i\theta}\psi(\vec{r}_2, \vec{r}_1) \tag{2}$$

for some phase $e^{i\theta}$. It is somewhat beyond the scope of this course to explain why the phase should
be independent of $\vec{r}_1, \vec{r}_2$, but I will mention that it relies on being in $\geq 3$ spatial dimensions and
that richer behavior exists in 1 and 2 dimensions. A more general way to express (2) is by defining
the swap operator $F$ by the relation

$$F(|\alpha\rangle \otimes |\beta\rangle) = |\beta\rangle \otimes |\alpha\rangle \tag{3}$$

for any single-particle states $|\alpha\rangle, |\beta\rangle$. Then (2) is equivalent to

$$F|\psi\rangle = e^{i\theta}|\psi\rangle. \tag{4}$$

Since $F^2 = I$, its eigenvalues can only be $\pm 1$, and so we must have $e^{i\theta} = \pm 1$. The corresponding
eigenspaces are called the symmetric and antisymmetric subspaces, respectively, and are denoted 


$$\operatorname{Sym}^2 V = \{|\psi\rangle \in V \otimes V : F|\psi\rangle = |\psi\rangle\} \tag{5a}$$
$$\operatorname{Anti}^2 V = \{|\psi\rangle \in V \otimes V : F|\psi\rangle = -|\psi\rangle\} \tag{5b}$$

Particles whose state space (for $N = 2$) is $\operatorname{Sym}^2 V$ are called bosons and those with state space
$\operatorname{Anti}^2 V$ are called fermions. The spin-statistics theorem states that particles with half-integer spin
(1/2, 3/2, etc.) are fermions and that particles with integer spin (0, 1, etc.) are bosons. The proof
of this involves field theory (or at least the existence of antiparticles) and is beyond the scope of
8.06 (but could conceivably be a term-paper topic).

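To find a basis for the symmetric and antisymmetric subspaces, we can construct projectors
onto them, and apply them to a basis for $V \otimes V$. Since $F$ has eigenvalues $\pm 1$, $P_{\text{sym}} = \frac{I+F}{2}$ will
project onto the $+1$ eigenspace (i.e. the symmetric subspace) and $P_{\text{anti}} = \frac{I-F}{2}$ will project onto the
$-1$ eigenspace (the antisymmetric subspace). The overall space $V \otimes V$ has a basis consisting of states
$|\alpha\rangle \otimes |\beta\rangle$. We can assume that $|\alpha\rangle, |\beta\rangle$ came from some orthonormal basis for $V$, so that in particular
they are either equal or orthogonal. Applying $P_{\text{sym}}$ we get $|\alpha\rangle \otimes |\alpha\rangle$ (if $|\alpha\rangle = |\beta\rangle$) or
$\frac{|\alpha\rangle \otimes |\beta\rangle + |\beta\rangle \otimes |\alpha\rangle}{2}$ (if $|\alpha\rangle, |\beta\rangle$ are orthogonal). The latter state can be normalized to obtain
$\frac{|\alpha\rangle \otimes |\beta\rangle + |\beta\rangle \otimes |\alpha\rangle}{\sqrt{2}}$. Similarly if we apply $P_{\text{anti}}$ to $|\alpha\rangle \otimes |\beta\rangle$ we obtain 0 if $|\alpha\rangle = |\beta\rangle$ or
$\frac{|\alpha\rangle \otimes |\beta\rangle - |\beta\rangle \otimes |\alpha\rangle}{\sqrt{2}}$ after normalizing if $|\alpha\rangle, |\beta\rangle$ are
orthogonal. These states are all orthogonal to each other except for when we exchange $|\alpha\rangle$ and $|\beta\rangle$,
in which case we get back either the same state (symmetric subspace) or the same state multiplied
by $-1$ (antisymmetric subspace).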

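If $V$ is $d$-dimensional and has basis $\{|1\rangle, \ldots, |d\rangle\}$ then $V \otimes V$ is $d^2$-dimensional and has basis
$\{|1\rangle \otimes |1\rangle, |1\rangle \otimes |2\rangle, \ldots, |d\rangle \otimes |d\rangle\}$. $\operatorname{Sym}^2 V$ has basis

$$\left\{|\alpha\rangle \otimes |\alpha\rangle : 1 \leq \alpha \leq d\right\} \cup \left\{\frac{|\alpha\rangle \otimes |\beta\rangle + |\beta\rangle \otimes |\alpha\rangle}{\sqrt{2}} : 1 \leq \alpha < \beta \leq d\right\}, \tag{6}$$

where we have arbitrarily assumed that $\alpha < \beta$. We could have equivalently chosen $\alpha > \beta$, but
should not do both so that we do not double-count the same states. Similarly $\operatorname{Anti}^2 V$ has basis

$$\left\{\frac{|\alpha\rangle \otimes |\beta\rangle - |\beta\rangle \otimes |\alpha\rangle}{\sqrt{2}} : 1 \leq \alpha < \beta \leq d\right\}. \tag{7}$$

This has $\binom{d}{2} = \frac{d(d-1)}{2}$ elements, corresponding to the number of ways of choosing two elements
from a $d$-element set. Similarly the basis for $\operatorname{Sym}^2 V$ has $d + \binom{d}{2} = \frac{d(d+1)}{2}$ elements. We can check
that the dimensions add up: $\frac{d(d+1)}{2} + \frac{d(d-1)}{2} = d^2$. (But beware that this situation is unique to
$N = 2$. For $N > 2$, $V^{\otimes N}$ contains states that are neither completely symmetric nor completely
antisymmetric. The situation then is beyond the scope of 8.06, but "Schur-Weyl duality" is the
phrase to google to learn more.)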
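
The dimension count can be confirmed by building $P_{\text{sym}} = (I+F)/2$ and $P_{\text{anti}} = (I-F)/2$ explicitly and taking traces, since the trace of a projector equals the dimension of the subspace it projects onto. The following is a small illustrative sketch in exact rational arithmetic; the function names `swap_projectors` and `trace` are made up for this example.

```python
from fractions import Fraction
from itertools import product

def swap_projectors(d):
    """Return (P_sym, P_anti) on V (x) V for dim V = d, as matrices of Fractions."""
    basis = list(product(range(d), repeat=2))          # labels (alpha, beta)
    index = {ab: k for k, ab in enumerate(basis)}
    n = d * d
    half = Fraction(1, 2)
    p_sym = [[Fraction(0)] * n for _ in range(n)]
    p_anti = [[Fraction(0)] * n for _ in range(n)]
    for (a, b), col in index.items():
        row_same, row_swapped = col, index[(b, a)]     # F maps |a,b> to |b,a>
        p_sym[row_same][col] += half                   # I/2 term
        p_sym[row_swapped][col] += half                # F/2 term
        p_anti[row_same][col] += half
        p_anti[row_swapped][col] -= half
    return p_sym, p_anti

def trace(m):
    return sum(m[k][k] for k in range(len(m)))

for d in (2, 3, 4):
    p_sym, p_anti = swap_projectors(d)
    # Expect d(d+1)/2 and d(d-1)/2, which add up to d^2.
    print(d, trace(p_sym), trace(p_anti), trace(p_sym) + trace(p_anti))
```

For $d = 2$ the two traces are 3 and 1, matching the triplet and singlet of two spin-1/2 particles.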


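Example: spin-1/2 particles. The simplest case is when $d = 2$. In this case, we use spin
notation and describe the single-particle basis with $\{|+\rangle, |-\rangle\}$. The resulting basis for $\operatorname{Sym}^2 \mathbb{C}^2$ is
$\left\{|++\rangle, \frac{|+-\rangle + |-+\rangle}{\sqrt{2}}, |--\rangle\right\}$ and the basis for $\operatorname{Anti}^2 \mathbb{C}^2$ is $\left\{\frac{|+-\rangle - |-+\rangle}{\sqrt{2}}\right\}$. These are referred to as the
triplet and singlet respectively.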


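1.2 N particles


Again if there are $N$ distinguishable particles, then their joint state space is $V^{\otimes N}$, where $V$ is the
single-particle state space. A basis for this space is given by vectors of the form $|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle$.
To define the symmetric and antisymmetric subspaces, define the operator $F^{ij}$ to swap tensor
positions $i$ and $j$, i.e. if $i < j$ then

$$F^{ij}|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle = |\alpha_1\rangle \otimes \cdots \otimes |\alpha_{i-1}\rangle \otimes |\alpha_j\rangle \otimes |\alpha_{i+1}\rangle \otimes \cdots \otimes |\alpha_{j-1}\rangle \otimes |\alpha_i\rangle \otimes |\alpha_{j+1}\rangle \otimes \cdots \otimes |\alpha_N\rangle \tag{8}$$

(and the definition is similar if $i > j$). While these operators do not commute, we can define the
symmetric and antisymmetric subspaces to be their simultaneous $+1$ (resp. $-1$) eigenspaces:

$$\operatorname{Sym}^N V = \{|\psi\rangle \in V^{\otimes N} : F^{ij}|\psi\rangle = |\psi\rangle \ \ \forall i \neq j\} \tag{9a}$$
$$\operatorname{Anti}^N V = \{|\psi\rangle \in V^{\otimes N} : F^{ij}|\psi\rangle = -|\psi\rangle \ \ \forall i \neq j\} \tag{9b}$$

The corresponding wavefunctions are those satisfying

$$\psi(\vec{r}_1, \ldots, \vec{r}_i, \ldots, \vec{r}_j, \ldots, \vec{r}_N) = \pm\psi(\vec{r}_1, \ldots, \vec{r}_j, \ldots, \vec{r}_i, \ldots, \vec{r}_N). \tag{10}$$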


To compute bases for the symmetric and antisymmetric subspaces, we need to repeat our exercise
of defining the symmetric and antisymmetric projectors and then applying them to basis states.
This will be more complicated than the $N = 2$ case. Define $S_N$ to be the set of permutations of $N$
objects, i.e. the set of 1-1 functions from $\{1, \ldots, N\}$ to itself. $|S_N| = N!$ since for $\pi \in S_N$
there are $N$ ways to choose $\pi(1)$, $N - 1$ ways to choose $\pi(2)$ (i.e. any element of $\{1, \ldots, N\}$ not
equal to $\pi(1)$), $N - 2$ ways to choose $\pi(3)$ and so on for $\pi(4), \ldots, \pi(N)$. For a permutation $\pi$ define
the operator $F^\pi$ to be the map sending each state $|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle$ to $|\alpha_{\pi^{-1}(1)}\rangle \otimes \cdots \otimes |\alpha_{\pi^{-1}(N)}\rangle$. One
particularly simple example of a permutation is a transposition, which exchanges two positions and
leaves the other positions untouched. The $F^{ij}$ operators above are the operators corresponding to
transpositions.

One useful fact about $S_N$ is that it is a group, meaning that it contains the identity permutation
(denoted $e$) and is closed under multiplication and inverse. In other words if $\pi, \nu \in S_N$ then
applying $\nu$ then $\pi$ is another permutation (denoted $\pi\nu$) and there exists a permutation $\pi^{-1}$ satisfying
$\pi\pi^{-1} = \pi^{-1}\pi = e$. Additionally $F^\pi$ is a representation, meaning that $F^{\pi\nu} = F^\pi F^\nu$. Verifying these
facts is a useful exercise. One consequence is that the sets $\{\pi : \pi \in S_N\}$ and $\{\nu\pi : \pi \in S_N\}$ are the
same.

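One can use these to show that the symmetric and antisymmetric projectors are given by

$$P_{\text{sym}} = \frac{1}{N!}\sum_{\pi \in S_N} F^\pi \qquad\text{and}\qquad P_{\text{anti}} = \frac{1}{N!}\sum_{\pi \in S_N} \operatorname{sgn}(\pi) F^\pi. \tag{11}$$

To prove this, we need to argue that $\operatorname{Im} P_{\text{sym}} \subseteq \operatorname{Sym}^N V$ and that if $|\psi\rangle \in \operatorname{Sym}^N V$ then $P_{\text{sym}}|\psi\rangle = |\psi\rangle$.
For the former, an arbitrary element of the image can be written as

$$P_{\text{sym}}|\psi\rangle = \frac{1}{N!}\sum_{\pi \in S_N} F^\pi|\psi\rangle.$$

Applying $F^\nu$ yields

$$F^\nu P_{\text{sym}}|\psi\rangle = \frac{1}{N!}\sum_{\pi \in S_N} F^\nu F^\pi|\psi\rangle = \frac{1}{N!}\sum_{\pi \in S_N} F^{\nu\pi}|\psi\rangle = \frac{1}{N!}\sum_{\pi \in S_N} F^\pi|\psi\rangle = P_{\text{sym}}|\psi\rangle.$$

The third equality used the fact that $\pi \mapsto \nu\pi$ is a 1-1 map. Next suppose that $|\psi\rangle \in \operatorname{Sym}^N V$.
Then

$$P_{\text{sym}}|\psi\rangle = \frac{1}{N!}\sum_{\pi \in S_N} F^\pi|\psi\rangle = \frac{1}{N!}\sum_{\pi \in S_N} |\psi\rangle = |\psi\rangle,$$

where the second equality used the fact that $F^\pi|\psi\rangle = |\psi\rangle$ for all $|\psi\rangle \in \operatorname{Sym}^N V$ (every permutation is
a product of transpositions, each of which leaves $|\psi\rangle$ unchanged).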

The argument for the antisymmetric projector is similar, but we first need to define $\operatorname{sgn}(\pi)$,
which is called the sign of a permutation. It is defined to be 1 if $\pi$ can be written as a product
of an even number of transpositions or $-1$ if $\pi$ can be written as a product of an odd number of
transpositions. For example, for $N = 3$, $\operatorname{sgn}(\pi) = 1$ if $\pi$ is the identity permutation, or a cycle
of length 3, such as $1 \to 2 \to 3 \to 1$; in fact, $\operatorname{sgn}(\pi) = \epsilon_{\pi(1),\pi(2),\pi(3)}$, where $\epsilon_{ijk}$ is the familiar Levi-Civita
symbol. It is not clear that $\operatorname{sgn}(\pi)$ is well-defined: $\pi$ can be written as a product of transpositions
in an infinite number of ways, and what if some of them involve an even number of transpositions
and some involve an odd number? It turns out that this never happens. To prove this, an alternate
definition of $\operatorname{sgn}(\pi)$ can be shown to be

$$\operatorname{sgn}(\pi) = \det\left(\sum_{i=1}^N |i\rangle\langle\pi(i)|\right), \tag{12}$$

which suffers from no such ambiguity. As an example of (12), the permutation which swaps 1 and
2 has sign $-1$, which equals $\det\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Similarly any single transposition has sign $-1$ according
to (12) and the multiplication rule for determinants ($\det(AB) = \det(A)\det(B)$) can be used to
show that these two definitions of $\operatorname{sgn}(\pi)$ are equivalent. Like the determinant, the sgn function
obeys $\operatorname{sgn}(\nu\pi) = \operatorname{sgn}(\nu)\operatorname{sgn}(\pi)$ for any permutations $\nu, \pi$. This can be used to prove that $P_{\text{anti}}$ is
the projector onto the antisymmetric subspace, using an argument similar to the one used for $P_{\text{sym}}$
and the symmetric subspace.
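
The well-definedness of the sign and its multiplicativity can be checked by brute force for small $N$. The sketch below is illustrative only: it uses 0-indexed permutations stored as tuples, and it swaps in the parity of the inversion count as the unambiguous definition in place of the determinant in (12) (an assumption of this example, although the two are known to agree). The function names are invented here.

```python
from itertools import permutations

def compose(pi, nu):
    """Return the permutation pi*nu, i.e. apply nu first and then pi."""
    return tuple(pi[nu[i]] for i in range(len(pi)))

def sgn_by_transpositions(pi):
    """Undo pi with explicit swaps; each swap is one transposition."""
    p, sign = list(pi), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # puts the value j into position j
            sign = -sign
    return sign

def sgn_by_inversions(pi):
    """Parity of the number of pairs i < j with pi(i) > pi(j)."""
    inversions = sum(1 for i in range(len(pi))
                     for j in range(i + 1, len(pi)) if pi[i] > pi[j])
    return -1 if inversions % 2 else 1

s4 = list(permutations(range(4)))
# The two definitions agree on all 24 elements of S_4.
print(all(sgn_by_transpositions(p) == sgn_by_inversions(p) for p in s4))
# The sign is multiplicative: sgn(nu pi) = sgn(nu) sgn(pi).
print(all(sgn_by_inversions(compose(nu, pi))
          == sgn_by_inversions(nu) * sgn_by_inversions(pi)
          for nu in s4 for pi in s4))
# For N = 3 the even permutations are the identity and the two 3-cycles.
print([p for p in permutations(range(3)) if sgn_by_inversions(p) == 1])
```

The last line lists exactly the index orderings for which the Levi-Civita symbol equals $+1$.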
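As a result, we can write a basis for $\operatorname{Sym}^N V$ consisting of the states

$$|\psi^{\text{sym}}_{\alpha_1,\ldots,\alpha_N}\rangle = \mathcal{N} \sum_{\pi \in S_N} |\alpha_{\pi(1)}\rangle \otimes \cdots \otimes |\alpha_{\pi(N)}\rangle. \tag{13}$$

Here $\mathcal{N}$ is a normalization term that is equal to $(N!)^{-1/2}$ if the $\alpha_1, \ldots, \alpha_N$ are all distinct, equal to
$1/N!$ if they are all the same (the sum then contains $N!$ identical terms), and in general will be
somewhere between these two extremes. Similarly $\operatorname{Anti}^N V$ has a basis consisting of the states

$$|\psi^{\text{anti}}_{\alpha_1,\ldots,\alpha_N}\rangle = \frac{1}{\sqrt{N!}} \sum_{\pi \in S_N} \operatorname{sgn}(\pi)\, |\alpha_{\pi(1)}\rangle \otimes \cdots \otimes |\alpha_{\pi(N)}\rangle. \tag{14}$$

Since these are always zero if any of the $\alpha_i$'s are equal, the normalization is always $\frac{1}{\sqrt{N!}}$.
For spatial wavefunctions, there is a useful formula for $\langle\vec{r}_1, \ldots, \vec{r}_N|\psi^{\text{anti}}_{\alpha_1,\ldots,\alpha_N}\rangle$ derived by John
Slater in 1929. First we recall a formula for the determinant of a matrix

$$\det(A) = \sum_{\pi \in S_N} \operatorname{sgn}(\pi)\, A_{1,\pi(1)} A_{2,\pi(2)} \cdots A_{N,\pi(N)}. \tag{15}$$

Using this and the notation $\psi_\alpha(\vec{r}) = \langle\vec{r}|\alpha\rangle$, it is straightforward to show that

$$\langle\vec{r}_1, \ldots, \vec{r}_N|\psi^{\text{anti}}_{\alpha_1,\ldots,\alpha_N}\rangle = \frac{1}{\sqrt{N!}} \det\begin{pmatrix} \psi_{\alpha_1}(\vec{r}_1) & \cdots & \psi_{\alpha_1}(\vec{r}_N) \\ \vdots & & \vdots \\ \psi_{\alpha_N}(\vec{r}_1) & \cdots & \psi_{\alpha_N}(\vec{r}_N) \end{pmatrix}. \tag{16}$$

This is called a Slater determinant. For example when $N = 2$, the wavefunction is of the form

$$\frac{\psi_{\alpha_1}(\vec{r}_1)\psi_{\alpha_2}(\vec{r}_2) - \psi_{\alpha_2}(\vec{r}_1)\psi_{\alpha_1}(\vec{r}_2)}{\sqrt{2}}. \tag{17}$$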
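
To see a Slater determinant in action, the sketch below evaluates (16) through the permutation sum (15) for three particles on a line. The orbitals $\sqrt{2}\sin(k\pi x)$ are a purely hypothetical choice (particle-in-a-box eigenfunctions on $[0,1]$), and the names `sgn` and `slater` are invented for this illustration.

```python
import math
from itertools import permutations

def sgn(pi):
    """Sign of a 0-indexed permutation via the parity of its inversion count."""
    inversions = sum(1 for i in range(len(pi))
                     for j in range(i + 1, len(pi)) if pi[i] > pi[j])
    return -1 if inversions % 2 else 1

def slater(orbitals, positions):
    """Antisymmetric wavefunction: det[psi_{alpha_i}(r_j)] / sqrt(N!) via eq. (15)."""
    n = len(orbitals)
    total = 0.0
    for pi in permutations(range(n)):
        term = float(sgn(pi))
        for i in range(n):
            term *= orbitals[i](positions[pi[i]])   # matrix entry A[i][pi(i)]
        total += term
    return total / math.sqrt(math.factorial(n))

# Hypothetical single-particle orbitals: box eigenfunctions on [0, 1].
orbitals = [lambda x, k=k: math.sqrt(2) * math.sin(k * math.pi * x) for k in (1, 2, 3)]

psi = slater(orbitals, [0.2, 0.5, 0.9])
psi_swapped = slater(orbitals, [0.5, 0.2, 0.9])      # exchange particles 1 and 2
psi_coincident = slater(orbitals, [0.2, 0.2, 0.9])   # two particles at the same point

print(psi, psi_swapped, psi_coincident)
```

Exchanging two positions flips the sign of the wavefunction, and it vanishes when two particles coincide, which is the Pauli exclusion principle in position space.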


1.3. Non-interacting particles 


So far we have described only the state spaces. Now we begin to consider Hamiltonians. If $H$ is
a single-particle Hamiltonian (i.e. a Hermitian operator on $V$) then define $H_i$ to be $H$ acting on
system $i$ (in an $N$-particle system):

$$H_i = I^{\otimes (i-1)} \otimes H \otimes I^{\otimes (N-i)}. \tag{18}$$

If we have $N$ particles each experiencing Hamiltonian $H$ (e.g. $N$ spins in the same magnetic field)
then the total Hamiltonian is

$$\mathcal{H} = \sum_{i=1}^N H_i. \tag{19}$$

Suppose that the eigenvalues and eigenstates of $H$ are given by

$$H|\alpha\rangle = E_\alpha|\alpha\rangle$$

with $E_0 < E_1 < \ldots$. Then what is the spectrum of $\mathcal{H}$? There are three cases.


1.3.1 Distinguishable particles 


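The overall space is $V^{\otimes N}$, which has a basis consisting of all states $|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle$ that are tensor
products of single-particle energy eigenstates. Since

$$H_i|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle = E_{\alpha_i}|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle,$$

it follows that the total Hamiltonian $\mathcal{H} = \sum_i H_i$ satisfies

$$\mathcal{H}|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle = (E_{\alpha_1} + \ldots + E_{\alpha_N})|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle. \tag{20}$$

Thus $\{|\alpha_1\rangle \otimes \cdots \otimes |\alpha_N\rangle\}$ is an orthonormal basis of eigenstates of $\mathcal{H}$. The ground state is $|0\rangle^{\otimes N}$,
which has energy $NE_0$. The first excited subspace is $N$-fold degenerate and consists of all of the
states of the form $|1,0,0,\ldots,0\rangle$, $|0,1,0,\ldots,0\rangle$, etc. It has energy $(N-1)E_0 + E_1$. A general energy
level with all $\alpha_i$ distinct similarly has degeneracy $N!$, even aside from the possibility of obtaining
the same total energy by adding up different collections of $E_\alpha$'s.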


1.3.2. Bosons 


The ground state is still $|0\rangle^{\otimes N}$, or equivalently $|\psi^{\text{sym}}_{0,0,\ldots,0}\rangle$, and the ground state energy is still $NE_0$.
Again the energy of the first excited state is $(N-1)E_0 + E_1$. But now there is no degeneracy. The
first excited state is

$$|\psi^{\text{sym}}_{1,0,\ldots,0}\rangle = \frac{|1,0,\ldots,0\rangle + |0,1,0,\ldots,0\rangle + \ldots + |0,0,\ldots,1\rangle}{\sqrt{N}}. \tag{21}$$

We could write $|\psi^{\text{sym}}_{0,1,0,\ldots,0}\rangle$ or any other subscript with $N-1$ 0's and one 1, but these all refer to
exactly the same state. Similarly all the same energies $E_{\alpha_1} + \ldots + E_{\alpha_N}$ still exist in the spectrum of
the total Hamiltonian restricted to $\operatorname{Sym}^N V$, but the degeneracy of up to $N!$ is now gone. Specifically state $|\psi^{\text{sym}}_{\alpha_1,\ldots,\alpha_N}\rangle$
has energy $E_{\alpha_1} + \ldots + E_{\alpha_N}$. Since these are a basis for $\operatorname{Sym}^N V$ we know we have thus accounted
for the entire spectrum.


1.3.3 Fermions 


Now things are substantially different. The state $|0\rangle^{\otimes N}$ is no longer legal, and so the ground state
energy is going to be different. If we use the basis given by $\{|\psi^{\text{anti}}_{\alpha_1,\ldots,\alpha_N}\rangle\}$, we see that this is already
an eigenbasis with state $|\psi^{\text{anti}}_{\alpha_1,\ldots,\alpha_N}\rangle$ having energy $E_{\alpha_1} + \ldots + E_{\alpha_N}$. So far this is the same as in
the boson case except that we must now have all the $\alpha_i$ distinct. Without loss of generality we can
assume $\alpha_1 < \alpha_2 < \ldots < \alpha_N$. As a result, $\alpha_i \geq i - 1$ and the energy is $\geq E_0 + E_1 + \ldots + E_{N-1}$.
This energy is achieved by the state $|\psi^{\text{anti}}_{0,1,\ldots,N-1}\rangle$ which must be the unique ground state. The first
excited state is $|\psi^{\text{anti}}_{0,1,\ldots,N-2,N}\rangle$ which has energy $E_0 + E_1 + \ldots + E_{N-2} + E_N$. Both of these are
non-degenerate unless there are degeneracies in the single-particle spectrum. One way to interpret the
first excited state is that we have added a particle with state $|N\rangle$ and a "hole" (meaning the absence
of a particle) with state $|N-1\rangle$. Higher excited states can be found by moving the particle to
higher energies (e.g. $|\psi^{\text{anti}}_{0,1,\ldots,N-2,N+1}\rangle$), moving the hole to lower energies (e.g. $|\psi^{\text{anti}}_{0,1,\ldots,N-3,N-1,N}\rangle$)
or creating additional particle-hole pairs (e.g. $|\psi^{\text{anti}}_{0,1,\ldots,N-3,N,N+1}\rangle$). Holes are studied in solid-state
physics, and were the way that Dirac originally explained positrons (although this explanation has
now been superseded by modern field theory).
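
The three cases can be compared side by side by enumerating configurations. The sketch below is a toy illustration under assumed inputs: the single-particle energies `levels = [0, 1, 3, 7]` and `n = 3` are arbitrary choices, and `spectrum` is a name invented here. Distinguishable particles correspond to ordered tuples of labels, bosons to multisets, and fermions to sets of distinct labels.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement, product

def spectrum(levels, n, kind):
    """Sorted list of (total energy, degeneracy) for n non-interacting particles."""
    labels = range(len(levels))
    if kind == "distinguishable":
        configs = product(labels, repeat=n)                 # ordered tuples
    elif kind == "bosons":
        configs = combinations_with_replacement(labels, n)  # multisets
    elif kind == "fermions":
        configs = combinations(labels, n)                   # distinct labels only
    else:
        raise ValueError(kind)
    counts = Counter(sum(levels[a] for a in cfg) for cfg in configs)
    return sorted(counts.items())

levels = [0, 1, 3, 7]   # hypothetical single-particle energies E_0 < E_1 < E_2 < E_3
n = 3

for kind in ("distinguishable", "bosons", "fermions"):
    print(kind, spectrum(levels, n, kind)[:3])
```

For distinguishable particles the first excited level has degeneracy $N = 3$, for bosons the same energy appears once, and for fermions the ground energy is $E_0 + E_1 + E_2$ and the first excited energy is $E_0 + E_1 + E_3$.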


1.4 Non-zero temperature 


Let us calculate the thermal state $e^{-\beta H}/Z$ for $N$ non-interacting fermions or bosons.

The eigenstates can be labeled by occupation numbers $n_0, n_1, n_2, \ldots$ where $n_i$ is the number of
particles in the single-particle state with energy $E_i$. For fermions $n_i$ can be 0 or 1, while for bosons, $n_i$ can be any nonnegative
integer.

Here it is easiest to work with the grand canonical ensemble. In this, the probability of a
microstate with energy $E$ and $N$ particles is proportional to $e^{-\beta(E - \mu N)}$ where $\beta = 1/k_B T$ and
$\mu$ is the chemical potential. We can think of this as resulting from the system being in thermal
contact with a reservoir containing many particles each with energy $\mu$. Alternatively, we can
maximize entropy subject to energy and particle number constraints and then $\beta, \mu$ emerge as
Lagrange multipliers.

For us, the benefit will be that the probability distribution factorizes. We find that the probability
of observing occupation numbers $n_0, n_1, \ldots$ is

$$\Pr[n_0, n_1, \ldots] = \frac{\exp\left(-\beta\sum_i n_i(E_i - \mu)\right)}{\sum_{n'_0, n'_1, \ldots}\exp\left(-\beta\sum_i n'_i(E_i - \mu)\right)} \tag{22}$$

where in the sum each $n'_i$ ranges over 0, 1 (for fermions) or over all nonnegative integers (for bosons).
Either way this factorizes as

$$\Pr[n_0, n_1, \ldots] = \prod_{i \geq 0} \frac{e^{-\beta n_i(E_i - \mu)}}{\sum_{n'_i} e^{-\beta n'_i(E_i - \mu)}}. \tag{23}$$

In other words, each occupation number is an independent random variable.
For fermions this results in the Fermi-Dirac distribution

$$\langle n_i\rangle = \Pr[n_i = 1] = \frac{e^{-\beta(E_i - \mu)}}{1 + e^{-\beta(E_i - \mu)}} \tag{24}$$

while for bosons we obtain the Bose-Einstein distribution

$$\Pr[n_i] = e^{-\beta n_i(E_i - \mu)}\left(1 - e^{-\beta(E_i - \mu)}\right) \tag{25}$$

$$\langle n_i\rangle = \frac{1}{e^{\beta(E_i - \mu)} - 1}. \tag{26}$$

Note that for bosons we require $\mu < E_i$ but for fermions this is not necessary.
The Fermi-Dirac occupation number can be rewritten as

$$\langle n_i\rangle = \frac{1}{e^{\beta(E_i - \mu)} + 1}. \tag{27}$$

As $\beta \to \infty$ this approaches a step function which is $= 1$ for $E_i < \mu$ and $= 0$ for $E_i > \mu$. Thus in
the zero-temperature limit we will fill levels with energy up to some limit $\mu$ and no levels above
this energy.
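
The closed forms (26) and (27) can be compared against a direct average over the single-level distribution in (23). This is a minimal sketch with made-up parameter values; `mean_occupation` truncates the boson sum at `max_n`, which is an approximation introduced here and is harmless as long as $e^{-\beta(E-\mu)}$ is well below 1.

```python
import math

def fermi_dirac(energy, mu, beta):
    """Mean occupation of a fermionic level, eq. (27)."""
    return 1.0 / (math.exp(beta * (energy - mu)) + 1.0)

def bose_einstein(energy, mu, beta):
    """Mean occupation of a bosonic level, eq. (26); requires energy > mu."""
    if energy <= mu:
        raise ValueError("bosons require mu < E")
    return 1.0 / (math.exp(beta * (energy - mu)) - 1.0)

def mean_occupation(energy, mu, beta, max_n):
    """<n> from the single-level factor of eq. (23), with n = 0, 1, ..., max_n."""
    weights = [math.exp(-beta * n * (energy - mu)) for n in range(max_n + 1)]
    z = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / z

energy, mu, beta = 1.0, 0.3, 2.0   # illustrative values only

print(mean_occupation(energy, mu, beta, max_n=1), fermi_dirac(energy, mu, beta))
print(mean_occupation(energy, mu, beta, max_n=200), bose_einstein(energy, mu, beta))

# Low temperature (large beta): the Fermi-Dirac occupation approaches a step at mu.
print(fermi_dirac(0.5, 1.0, 50.0), fermi_dirac(1.5, 1.0, 50.0))
```

With `max_n=1` the average reproduces the Fermi-Dirac value, with a large cutoff it reproduces the Bose-Einstein value, and the last line shows occupations close to 1 below $\mu$ and close to 0 above it.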


1.5 Composite particles 


Usually particles have multiple attributes with distinct degrees of freedom, e.g. their positions and
their spins. These are combined by tensor product, so we can write the state of a single electron
as $|\psi_{\text{electron}}\rangle = |\psi_{\text{spatial}}\rangle \otimes |\psi_{\text{spin}}\rangle$. This division is often somewhat arbitrary, as in the case of
electrons in hydrogen-like atoms, where the state could be written either as $|n, l, m, s\rangle$ or (dividing
into spatial and spin parts) as $|n, l, m\rangle \otimes |s\rangle$.
More generally, suppose the state space of a single particle is $V \otimes W$. Then the state space of $N$
distinguishable particles is

$$(V \otimes W)^{\otimes N} \cong V^{\otimes N} \otimes W^{\otimes N}. \tag{28}$$

This isomorphism is proved by simply rearranging the terms in the tensor product $V \otimes W \otimes V \otimes
W \otimes \cdots \otimes V \otimes W$ so that all the $V$'s precede all the $W$'s. For example, for $N$ distinguishable
particles in a $-1/r$ potential (e.g. imagine a proton surrounded by an electron, a muon, a tau
particle, and, well, let's just take $N$ to be 3) we could just as well use the basis

$$\{|n_1, l_1, m_1, s_1, \ldots, n_N, l_N, m_N, s_N\rangle\} \tag{29}$$

corresponding to $(V \otimes W)^{\otimes N}$ or the basis

$$\{|n_1, l_1, m_1, \ldots, n_N, l_N, m_N\rangle \otimes |s_1, \ldots, s_N\rangle\} \tag{30}$$

corresponding to $V^{\otimes N} \otimes W^{\otimes N}$.
For fermions and bosons, the situation is not quite so simple since $\operatorname{Sym}^N(V \otimes W) \not\cong \operatorname{Sym}^N V \otimes
\operatorname{Sym}^N W$ and $\operatorname{Anti}^N(V \otimes W) \not\cong \operatorname{Anti}^N V \otimes \operatorname{Anti}^N W$.
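Let us focus for now on the case of $N = 2$. Then

$$\operatorname{Anti}^2(V \otimes W) = \{|\psi\rangle \in V \otimes W \otimes V \otimes W : F^{12:34}|\psi\rangle = -|\psi\rangle\}, \tag{31}$$

where $F^{12:34}$ is the permutation that swaps positions 1,2 with positions 3,4. That is

$$F^{12:34}|\alpha_1, \alpha_2, \alpha_3, \alpha_4\rangle = |\alpha_3, \alpha_4, \alpha_1, \alpha_2\rangle. \tag{32}$$

What if we would like to understand $\operatorname{Anti}^2(V \otimes W)$ in terms of the symmetric and antisymmetric
subspaces of $V^{\otimes 2}$ and $W^{\otimes 2}$? Then it will be convenient to rearrange (31) and write (with some
small abuse of notation)

$$\operatorname{Anti}^2(V \otimes W) = \{|\psi\rangle \in V \otimes V \otimes W \otimes W : F^{13:24}|\psi\rangle = -|\psi\rangle\}, \tag{33}$$

where $F^{13:24}$ is the permutation that swaps positions 1,3 with positions 2,4, meaning

$$F^{13:24}|\alpha_1, \alpha_2, \alpha_3, \alpha_4\rangle = |\alpha_2, \alpha_1, \alpha_4, \alpha_3\rangle. \tag{34}$$

Since $F^{13:24}$ squared is the identity, its eigenvalues are again $\pm 1$. We can also write $F^{13:24} =
F^{12}F^{34}$, where

$$F^{12}|\alpha_1, \alpha_2, \alpha_3, \alpha_4\rangle = |\alpha_2, \alpha_1, \alpha_3, \alpha_4\rangle \tag{35a}$$
$$F^{34}|\alpha_1, \alpha_2, \alpha_3, \alpha_4\rangle = |\alpha_1, \alpha_2, \alpha_4, \alpha_3\rangle. \tag{35b}$$

Since $F^{12}$ and $F^{34}$ commute, the eigenvalues of their product are simply the product of their
eigenvalues. The joint eigenspaces are as follows

$$\begin{array}{ccc}
F^{13:24} & F^{12} & F^{34} \\ \hline
+1 & +1 & +1 \\
-1 & +1 & -1 \\
-1 & -1 & +1 \\
+1 & -1 & -1
\end{array}$$

Thus the $-1$ eigenspace of $F^{13:24}$ contains states in the $+1$ eigenspace of $F^{12}$ and the $-1$ eigenspace
of $F^{34}$. It also contains states in the $-1$ eigenspace of $F^{12}$ and the $+1$ eigenspace of $F^{34}$, as well
as superpositions of states in these two spaces. Putting this together we have

$$\operatorname{Anti}^2(V \otimes W) \cong (\operatorname{Sym}^2 V \otimes \operatorname{Anti}^2 W) \oplus (\operatorname{Anti}^2 V \otimes \operatorname{Sym}^2 W). \tag{36}$$

Similarly the symmetric subspace of two copies of $V \otimes W$ is

$$\operatorname{Sym}^2(V \otimes W) \cong (\operatorname{Sym}^2 V \otimes \operatorname{Sym}^2 W) \oplus (\operatorname{Anti}^2 V \otimes \operatorname{Anti}^2 W). \tag{37}$$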

As an application, a pair of electrons must have either a symmetric spatial wavefunction and an 
antisymmetric spin wavefunction (i.e. singlet), or vice versa, an antisymmetric spatial wavefunction 
and a symmetric spin wavefunction. This can lead to an effective spin-spin interaction, and is 
responsible for the phenomenon of ferromagnetism, which you will explore on your pset. 
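
A quick consistency check on the decompositions (36) and (37) is to compare dimensions on both sides, using $\dim \operatorname{Sym}^2 V = d(d+1)/2$ and $\dim \operatorname{Anti}^2 V = d(d-1)/2$ from Section 1.1. The sketch below does this for a few dimension pairs chosen arbitrarily; the helper names `sym2` and `anti2` are invented for this example.

```python
def sym2(d):
    """Dimension of Sym^2 of a d-dimensional space."""
    return d * (d + 1) // 2

def anti2(d):
    """Dimension of Anti^2 of a d-dimensional space."""
    return d * (d - 1) // 2

for d_v, d_w in [(2, 2), (3, 2), (4, 3)]:
    d = d_v * d_w  # dimension of the single-particle space V (x) W
    anti_rhs = sym2(d_v) * anti2(d_w) + anti2(d_v) * sym2(d_w)   # right side of (36)
    sym_rhs = sym2(d_v) * sym2(d_w) + anti2(d_v) * anti2(d_w)    # right side of (37)
    print((d_v, d_w), anti2(d), anti_rhs, sym2(d), sym_rhs)
```

For two electrons restricted to two spatial orbitals ($d_V = 2$, $d_W = 2$) this gives 6 antisymmetric states: 3 with a symmetric spatial part times the spin singlet, and 3 with the antisymmetric spatial part times the spin triplet.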


1.6 Emergence of distinguishability 


Given that all types of particles are in fact either bosons or fermions, why do we talk about 
distinguishable particles? Do they ever occur in nature? It would seem that they do, since if we 
have N spatially well-localized electrons, we can treat their spins as distinguishable. In other words, 
we say that the wavefunction is

$$|\psi\rangle = \sum_{s_1, \ldots, s_N \in \{+,-\}} c_{s_1,\ldots,s_N}|s_1, \ldots, s_N\rangle \tag{38}$$

with no constraints on the amplitudes $c_{s_1,\ldots,s_N}$ apart from the usual normalization condition
$\sum |c_{s_1,\ldots,s_N}|^2 = 1$. A Hamiltonian that acts only on spin 2 (say) would be of the form $I \otimes H \otimes I^{\otimes (N-2)}$.

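Let us examine carefully how this could be realized physically. Assume the electrons are in a
potential that traps them in positions that are far from each other. Denote the resulting spatial
vectors by $|1\rangle, |2\rangle, \ldots, |N\rangle$ corresponding to wavefunctions $\psi_1(\vec{r}), \ldots, \psi_N(\vec{r})$. If we had one electron
in position $|1\rangle$ with spin in state $|s_1\rangle$, another electron in position $|2\rangle$ with spin in state $|s_2\rangle$, and
so on, then the overall state would be

$$\frac{1}{\sqrt{N!}}\sum_{\pi \in S_N} \operatorname{sgn}(\pi)\, |\pi(1)\rangle \otimes |s_{\pi(1)}\rangle \otimes |\pi(2)\rangle \otimes |s_{\pi(2)}\rangle \otimes \cdots \otimes |\pi(N)\rangle \otimes |s_{\pi(N)}\rangle. \tag{39}$$

A general superposition of states of this form with different values of $s_1, \ldots, s_N$ would be

$$|\Psi\rangle = \sum_{s_1, \ldots, s_N \in \{+,-\}} c_{s_1,\ldots,s_N} \frac{1}{\sqrt{N!}}\sum_{\pi \in S_N} \operatorname{sgn}(\pi)\, |\pi(1)\rangle \otimes |s_{\pi(1)}\rangle \otimes |\pi(2)\rangle \otimes |s_{\pi(2)}\rangle \otimes \cdots \otimes |\pi(N)\rangle \otimes |s_{\pi(N)}\rangle. \tag{40}$$

This wavefunction is manifestly antisymmetric under exchanges that swap the spatial and spin
parts together.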

To see how (38) emerges from (40) consider an experiment that would try to apply a Hamiltonian 
to, say, the spin of the 2nd particle. When we say “the second particle” what we mean is “the
particle whose position in space corresponds to the wavefunction $\psi_2(\vec r)$.” For example, if we want
to apply a magnetic field that affects only this particle, we would apply a localized magnetic field
that is nonzero only in the region where $\psi_2(\vec r)$ is nonzero and $\psi_i(\vec r) = 0$ for $i \neq 2$. (Here we use the
assumption that the electrons are well separated.) Suppose this field is $B_z \hat z$ in this region and zero
elsewhere. This field would correspond to a single-particle Hamiltonian of the form $|2\rangle\langle 2| \otimes \omega_z S_z$


for $\omega_z = -eB_z$, where the $|2\rangle\langle 2|$ means that it affects only the part of the wavefunction in spatial
state $|2\rangle$. The resulting $N$-particle Hamiltonian is


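$$H = \sum_{i=1}^{N} (I \otimes I)^{\otimes (i-1)} \otimes \left(|2\rangle\langle 2| \otimes \omega_z S_z\right) \otimes (I \otimes I)^{\otimes (N-i)}. \tag{41}$$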


Observe here that the tensor position has no physical significance, but that different particles are 
effectively labeled by their spatial positions. Imagine a law-school professor who calls on students 
by seat number instead of by name. 

A similar argument could apply to N bosons. In each case, the states involved are not completely 
general states of N fermions/bosons. Returning to the case of electrons, we are considering states 
with exactly one electron per site. But states also exist with zero or two electrons in some sites 
(or superpositions thereof). If we apply a magnetic field to a site where there is no electron,
or bring a measuring device (say a coil to detect a changing magnetic field) nearby, then nothing 
will happen. What if there are two electrons on a site? Then again nothing will happen, but for 
a less obvious reason. This time it is because the spin singlet state is invariant under collective
rotation, and will not be affected by a magnetic field. Overall it is possible to observe behavior that 
is more complicated than in the model of N distinguishable spins. Spatial position can be used to 
distinguish particles, but it does not have to in every case. 


2 Degenerate Fermi gas 


2.1 Electrons in a box 


Consider N electrons in a box of size L x L x L with periodic boundary conditions. (Griffiths 
discusses hard-wall boundary conditions and it is a good exercise to check that both yield the same 
answer.) Ignore interactions between electrons. Then the Hamiltonian is 


$$H = \sum_{i=1}^{N} \frac{\vec p_i^{\,2}}{2m}. \tag{42}$$


We will see that even without interactions, a good deal of interesting physics will result simply from 
the Pauli exclusion principle. This is because the N-electron ground state will occupy the lowest 
$N$ levels of the single-electron Hamiltonian $p^2/2m$.

To analyze this, we start with the one-particle states. The eigenstates and energies are given 
by 


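$$\psi_{\vec k}(\vec r) = \frac{e^{i\vec k\cdot\vec r}}{L^{3/2}}, \qquad \vec k = \frac{2\pi}{L}\vec n, \quad \vec n \in \mathbb{Z}^3 \tag{43}$$

$$E_{\vec k} = \frac{\hbar^2 \vec k^2}{2m} = \frac{\hbar^2}{2m}\left(\frac{2\pi}{L}\right)^2 \vec n^2. \tag{44}$$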


The allowed values of $\vec k$ form a lattice with a spacing of $2\pi/L$ between adjacent points. However,
we will work in the limit where $L$ and $N$ are large; e.g. in macroscopic objects $N$ will be on the
order of $10^{23}$. In this limit, we can neglect the details of the lattice and say instead that there
is one allowed wavevector per $(2\pi/L)^3$ volume in $k$-space (or two if we count spin). It is left as an
exercise to make this intuition precise.

Because of the spin degree of freedom, $N$ electrons in their ground state will fill up the lowest
$N/2$ energies, corresponding to the $\vec k$ with the lowest values of $k^2$. These $\vec k$ vectors are contained in
a sphere of some radius which we call $k_F$ (aka the Fermi wave vector). Since each wavevector can
be thought of as taking up “volume” $(2\pi/L)^3$ in $k$-space, we obtain the following equation for $k_F$:

$$\frac{4}{3}\pi k_F^3 = \frac{N}{2}\left(\frac{2\pi}{L}\right)^3 \tag{45a}$$

$$k_F = (3\pi^2)^{1/3}\left(\frac{N}{V}\right)^{1/3} = (3\pi^2 n)^{1/3} \tag{45b}$$


The fact that $k_F$ (in this calculation) depends only on the density $n = N/V$ reflects the principle
that $k_F$ is independent of the shape of the material.
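The counting argument behind (45a) and (45b) can be checked numerically. The following is a minimal sketch, not part of the original notes: it counts the lattice points $\vec k = (2\pi/L)\vec n$ inside a sphere directly and compares the radius with the continuum estimate $(3\pi^2 n)^{1/3}$. The function names and the choice of a sphere of radius 20.5 lattice spacings are illustrative assumptions.

```python
import math

def count_states_in_sphere(k_radius, L):
    """Count spin-1/2 states with k = (2*pi/L)*(nx, ny, nz) and |k| <= k_radius."""
    dk = 2 * math.pi / L                 # lattice spacing in k-space
    n_max = int(k_radius / dk) + 1       # largest integer index worth checking
    r2 = (k_radius / dk) ** 2            # squared radius in units of dk
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                if nx * nx + ny * ny + nz * nz <= r2:
                    count += 1
    return 2 * count                     # factor of 2 for spin

def fermi_wavevector(N, L):
    """Continuum estimate k_F = (3 pi^2 n)^(1/3) with n = N / L^3, as in (45b)."""
    return (3 * math.pi ** 2 * N / L ** 3) ** (1 / 3)

L = 1.0                                  # box size (arbitrary units, assumed)
k_radius = 20.5 * 2 * math.pi / L        # hypothetical Fermi sphere radius
N_counted = count_states_in_sphere(k_radius, L)
ratio = fermi_wavevector(N_counted, L) / k_radius
print(N_counted, ratio)                  # the ratio should be close to 1
```

The ratio is not exactly 1 because the lattice is discrete; the discrepancy shrinks as the sphere grows, which is the sense in which the lattice details can be neglected for large $N$.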

From this calculation we can immediately derive many physically relevant quantities. The
chemical potential is the energy associated with adding one electron to the system, i.e. $\mu =
E_{\text{gs}}(N+1) - E_{\text{gs}}(N)$. Since this new electron would have momentum $\approx \hbar k_F$, its energy (which is
also the chemical potential) is

$$E_F = \frac{\hbar^2 k_F^2}{2m}. \tag{46}$$

This energy is also called the Fermi energy.
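We can also calculate the ground-state energy. Here is a way to do this that is simple enough
that you should be able to recreate it in your head. First recalculate the number of particles in
terms of $k_F$ as

$$N = \int_0^{k_F} 2\left(\frac{L}{2\pi}\right)^3 (4\pi k^2\,dk) = \int_0^{k_F} c\,k^2\,dk = c\,\frac{k_F^3}{3}, \tag{47}$$

for some constant $c$. Similarly write $E_{\text{gs}}$ as

$$E_{\text{gs}} = \int_0^{k_F} 2\left(\frac{L}{2\pi}\right)^3 (4\pi k^2\,dk)\,\frac{\hbar^2 k^2}{2m} = \frac{\hbar^2}{2m}\int_0^{k_F} c\,k^4\,dk = \frac{\hbar^2}{2m}\,c\,\frac{k_F^5}{5} = \frac{3}{5}N\frac{\hbar^2 k_F^2}{2m} = \frac{3}{5}N E_F. \tag{48}$$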


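In hindsight we should have guessed that $E_{\text{gs}}$ would be some constant times $N E_F$. There are $N$
electrons, each with energy somewhere between 0 and $E_F$ depending on their position within the
sphere. So the only nontrivial calculation was to get the constant $\frac{3}{5}$, which boils down to

$$\frac{\int_0^{k_F} k^4\,dk}{\int_0^{k_F} k^2\,dk} = \frac{3}{5}k_F^2.$$

Now consider the volume-dependence of the energy. The ground-state energy is proportional
to $E_F$, which in turn scales like $n^{2/3}$, or equivalently, like $V^{-2/3}$. Thus there is pressure even from
non-interacting fermions. This pressure has many equivalent forms

$$P = -\left.\frac{\partial E_{\text{gs}}}{\partial V}\right|_N = -\frac{3}{5}N\frac{\partial E_F}{\partial V} = -\frac{3}{5}N\left(-\frac{2}{3}\right)\frac{E_F}{V} = \frac{2}{5}nE_F = \frac{2}{3}\frac{E_{\text{gs}}}{V} = \frac{(3\pi^2)^{2/3}\hbar^2}{5m}n^{5/3} = \frac{(3\pi^2)^{2/3}\hbar^2}{5m^{8/3}}\rho^{5/3}, \tag{49}$$

where in this last version we have defined the (mass) density $\rho = mn$.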
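As a quick consistency check on (49), one can differentiate $E_{\text{gs}}(V)$ numerically and compare with the closed form $\frac{2}{5}nE_F$. This is a sketch under simple assumptions: units with $\hbar = m = 1$, and the hypothetical values $N = 1000$, $V = 2$ chosen only for illustration.

```python
import math

def fermi_energy(N, V, hbar=1.0, m=1.0):
    """E_F = hbar^2 (3 pi^2 N/V)^(2/3) / (2m), combining (45b) and (46)."""
    return hbar ** 2 * (3 * math.pi ** 2 * N / V) ** (2 / 3) / (2 * m)

def ground_state_energy(N, V, hbar=1.0, m=1.0):
    """E_gs = (3/5) N E_F, as in (48)."""
    return 0.6 * N * fermi_energy(N, V, hbar, m)

def pressure_finite_difference(N, V, rel_step=1e-5):
    """P = -dE_gs/dV at fixed N, estimated with a central difference."""
    dV = rel_step * V
    return -(ground_state_energy(N, V + dV) - ground_state_energy(N, V - dV)) / (2 * dV)

def pressure_closed_form(N, V):
    """P = (2/5) n E_F, one of the equivalent forms in (49)."""
    return 0.4 * (N / V) * fermi_energy(N, V)

N, V = 1000.0, 2.0                       # illustrative values, not from the notes
p_numeric = pressure_finite_difference(N, V)
p_formula = pressure_closed_form(N, V)
print(p_numeric, p_formula)
```

The two numbers should agree to many digits, confirming that the factor $\frac{2}{5}$ follows from the $V^{-2/3}$ scaling alone.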

Let’s plug in some numbers for a realistic system. In copper, there is one free electron per
atom. The density is about $9\,\text{g/cm}^3$ and the atomic weight is 63 g/mole, which corresponds to
$n \approx 10^{23}/\text{cm}^3$. An electron has mass 511 keV (working in units where $\hbar = 1$ and $c = 1$). In these
units $1\,\text{cm} = 5\cdot 10^4\,\text{eV}^{-1}$. Putting this together we get $E_F = O(1)$ eV.

This is $\gg k_B T$ for room temperature $T$, justifying the assumption that the state is close to the
ground state at room temperature, and corresponds to a $v_F = \sqrt{2E_F/m}$ that is $\ll c$, justifying a
non-relativistic approximation.
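The copper estimate can be reproduced in a few lines. The sketch below assumes the rounded density $n \approx 10^{23}/\text{cm}^3$ quoted above and standard values of $\hbar c$, the electron rest energy and $k_B$; the function and variable names are illustrative.

```python
import math

HBAR_C_EV_CM = 1.973269804e-5    # hbar * c in eV * cm
M_E_EV = 510998.95               # electron rest energy m_e c^2 in eV
K_B_EV_PER_K = 8.617333262e-5    # Boltzmann constant in eV / K

def fermi_energy_ev(n_per_cm3):
    """E_F = (hbar c k_F)^2 / (2 m c^2) with k_F = (3 pi^2 n)^(1/3)."""
    k_F = (3 * math.pi ** 2 * n_per_cm3) ** (1 / 3)   # in 1/cm
    return (HBAR_C_EV_CM * k_F) ** 2 / (2 * M_E_EV)

n_copper = 1e23                                  # rough electron density from the text
E_F = fermi_energy_ev(n_copper)                  # Fermi energy in eV
thermal_energy = K_B_EV_PER_K * 300.0            # k_B T at roughly room temperature
v_F_over_c = math.sqrt(2 * E_F / M_E_EV)         # Fermi velocity in units of c
print(E_F, E_F / thermal_energy, v_F_over_c)
```

The output is a Fermi energy of a few eV, a ratio $E_F/k_BT$ in the hundreds, and a Fermi velocity well below one percent of $c$, in line with the two justifications given above.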

Can we justify neglecting the Coulomb interaction? See the pset. 




The Drude-Sommerfeld model. This model of a metal is simple but is already enough to 
derive the Drude-Sommerfeld model that explains thermal and electrical conductivity, heat capacity 
and thermal emission. On the other hand, adding a periodic potential (see below) will somewhat 
complicate this picture. 

I will briefly describe the Drude-Sommerfeld model first. One goal is to explain Ohm’s Law 


$$\vec J = \sigma \vec E, \tag{50}$$


where $\vec E$ is the electric field, $\vec J$ is current and $\sigma$ is conductivity. (Here $\sigma$ will be a scalar but more
generally it could be a $3 \times 3$ matrix.) Suppose that electrons accelerate ballistically for a typical
time $\tau$ before a collision which randomizes their velocity. In the absence of a collision they will
accelerate according to

$$m\dot{\vec v} = e\vec E, \tag{51}$$

so their mean velocity will be

$$\vec v_d = \frac{e}{m}\vec E\tau. \tag{52}$$

Here we write $\vec v_d$ to mean the “drift” velocity of an electron. The actual velocity will include
thermal motion as well but this averages to zero. The net current is $ne\vec v_d$, where $n$ is the electron
density. Putting this together we obtain

$$\sigma = \frac{ne^2\tau}{m}. \tag{53}$$


One unsatisfactory feature of this model is the presence of the phenomenological constant $\tau$.
On the other hand, a similar calculation (omitted) can express thermal conductivity $\kappa$ in terms of
$\tau$, and predicts that the “Lorenz number” $\frac{\kappa}{\sigma T}$ should be a universal constant. Modeling electrons
as a classical gas predicts that this number should be $\approx 1.1 \times 10^{-8}\ \text{W}\,\Omega/\text{K}^2$ and modeling them
as a Fermi gas predicts $\approx 2.44 \times 10^{-8}\ \text{W}\,\Omega/\text{K}^2$. In real metals, this number is much closer to the
Fermi gas prediction, which provides some support for the Drude-Sommerfeld theory.
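The two quoted values can be reproduced from $k_B$ and $e$ alone. The notes omit the derivation, so the prefactors used here are an assumption taken from the standard textbook results: $\frac{3}{2}(k_B/e)^2$ for the classical gas and $\frac{\pi^2}{3}(k_B/e)^2$ for the Fermi gas. A minimal sketch:

```python
import math

K_B = 1.380649e-23        # Boltzmann constant in J / K
E_CHARGE = 1.602176634e-19  # elementary charge in C

def lorenz_classical():
    """Classical (Drude) prediction, assuming the textbook prefactor 3/2."""
    return 1.5 * (K_B / E_CHARGE) ** 2

def lorenz_fermi():
    """Fermi-gas (Sommerfeld) prediction, assuming the textbook prefactor pi^2/3."""
    return (math.pi ** 2 / 3) * (K_B / E_CHARGE) ** 2

print(lorenz_classical(), lorenz_fermi())   # both in W * ohm / K^2
```

Both numbers are of order $10^{-8}\ \text{W}\,\Omega/\text{K}^2$ and differ by roughly a factor of two, which is why measurements can discriminate between the two pictures.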

There are some other aspects of the theory which are clearly too simplistic. The mean free 
path $\ell = v_F\tau$ is hundreds of times larger than the spacing between atoms. Why aren’t there more
frequent collisions with the atomic lattice or the other electrons? There are qualitative problems 
as well. While (53) does not depend on the sign of the charge carriers, the Hall effect (discussed 
later) does. Observations show that most materials have negative charge carriers but some have 
positive charge carriers. Another strange empirical fact is that some crystals are conductors and 
others are semiconductors or insulators; overall, resistivity varies by more than a factor of $10^{20}$.
The conductance is also temperature dependent, but not always in the same direction: raising 
temperature will reduce the conductivity of metals but increase it for semiconductors such as 
silicon and germanium. 

Explaining these facts will require understanding periodic potentials, which we will return to 
in Section 2.3. 


2.2. White dwarves 
Our sun is powered by fusion, primarily via the p-p process: 
$$4\,\mathrm{H} \to {}^4\mathrm{He} + 2e^+ + 2e^- + 2\nu_e + \text{heat}.$$


This creates thermal pressure outward which balances the inward gravitational pressure. When 
the hydrogen runs out, helium can further fuse and form carbon and oxygen. In larger stars
this will continue until iron is formed (heavier elements are also produced but do not
create more energy), but eventually fusion will stop. At this point the heat will be radiated away,
the temperature will drop and gravity will cause the star to dramatically shrink. Because of the
degeneracy pressure of the electrons, this will not necessarily end in a black hole.

We will model the star by a collection of noninteracting nonrelativistic electrons in their ground 
state. There are a lot of assumptions here. The ground-state assumption is justified because 
photons will carry away most of the energy of the star. The non-interacting assumption will be 
discussed on the pset. We consider only electrons and not nuclei because degeneracy pressure scales 
with $E_F \propto 1/m$, although we will revisit this, as well as the nonrelativistic assumption, later in
the lecture. We assume also a uniform density which is not really true, but does not change the 
qualitative picture. 

For a given stellar mass $M_*$ we would like to find the radius $R_*$ that balances the gravitational
pressure with the electron degeneracy pressure. We can find this by computing the total energy
$E_{\text{tot}}$ and setting $dE_{\text{tot}}/dR = 0$. The resulting value of $R_*$ is a stable point if $d^2E_{\text{tot}}/dR^2 > 0$,
i.e. if it is a local minimum of the energy.

The free parameters are: 


• $N$, the number of nucleons (protons and neutrons). These each have mass roughly $m_p \approx
10^9\,\text{eV}/c^2$ and carry most of the mass of the star, so that $M_* \approx N m_p$.


• $f$, the fraction of electrons per nucleon. Charge balance means that there are $N_e = fN$
electrons, $fN$ protons and $(1-f)N$ neutrons, and in particular that $0 \le f \le 1$.


Let’s calculate the energy contribution from gravity. We will make use of the density $\rho = \frac{M_*}{\frac{4}{3}\pi R^3}$.

$$E_{\text{grav}}(R) = -\int_0^R \frac{G\,(\text{mass enclosed by radius } r)(\text{mass at distance } r)}{r}$$
$$= -\int_0^R \frac{G}{r}\left(\frac{4}{3}\pi r^3\rho\right)\left(4\pi r^2\rho\,dr\right)$$
$$= -\frac{3}{5}\frac{GM_*^2}{R}$$
$$= -\frac{\kappa N^2}{R},$$

where in the last step we have defined the universal constant $\kappa = \frac{3}{5}Gm_p^2$.
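On the other hand the energy from electron degeneracy is

$$E_{\text{degen}}(R) = E_{\text{gs}} = \frac{3}{5}N_eE_F(R) = \frac{3}{5}N_e\frac{\hbar^2}{2m_e}\left(\frac{3\pi^2N_e}{V}\right)^{2/3} = \frac{\lambda N^{5/3}}{m_eR^2},$$

where again $\lambda$ is a universal constant. For $f = 0.6$ (as is the case for our sun) we have $\lambda \approx 1.1\hbar^2$.
Combining these we have

$$E_{\text{tot}}(R) = \frac{\lambda N^{5/3}}{m_eR^2} - \frac{\kappa N^2}{R}. \tag{54}$$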


[Figure 1 plot: $E_{\text{tot}}$ (arbitrary units) versus $R$ (arbitrary units), with the minimum marked at $R_*$.]

Figure 1: Plot of (54) showing energy from gravity and degeneracy pressure as a function of radius.


This is plotted in Fig. 1. 
Setting $dE_{\text{tot}}(R)/dR = 0$ we find

$$R_* = \frac{2\lambda}{m_e\kappa}N^{-1/3}. \tag{55}$$


On the pset you will plug in numbers showing that for $M_* = M_{\text{sun}}$ the radius is $R_* \approx R_{\text{earth}}$. Thus,
this predicts an enormous but finite density.

One strange feature of (55) is that as $N$ increases (equiv. as $M_*$ increases), the radius $R_*$
decreases. This means that as the star gets more massive it approaches infinite density. This clearly cannot be
valid indefinitely.
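The location of the minimum in (55) can be confirmed by brute force. The following sketch works in arbitrary units with $\lambda = \kappa = m_e = 1$ (an assumption made purely for illustration), scans $E_{\text{tot}}(R)$ from (54) on a grid, and compares the minimizing radius with the closed form.

```python
def total_energy(R, N, lam=1.0, kappa=1.0, m_e=1.0):
    """E_tot(R) from (54): degeneracy term minus gravitational term."""
    return lam * N ** (5 / 3) / (m_e * R ** 2) - kappa * N ** 2 / R

def equilibrium_radius(N, lam=1.0, kappa=1.0, m_e=1.0):
    """Closed-form minimum R_* = 2 lam / (m_e kappa) * N^(-1/3), as in (55)."""
    return 2 * lam / (m_e * kappa) * N ** (-1 / 3)

def scan_minimum(N, r_lo, r_hi, steps=100000):
    """Return the grid point in [r_lo, r_hi] with the lowest total energy."""
    best_R, best_E = r_lo, total_energy(r_lo, N)
    for i in range(1, steps + 1):
        R = r_lo + (r_hi - r_lo) * i / steps
        E = total_energy(R, N)
        if E < best_E:
            best_R, best_E = R, E
    return best_R

N = 8.0                                   # illustrative nucleon number (arbitrary units)
R_formula = equilibrium_radius(N)
R_scan = scan_minimum(N, 0.1, 10.0)
print(R_formula, R_scan)
```

Repeating the scan with larger `N` gives a smaller radius, which is the $N^{-1/3}$ scaling of (55).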

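Let’s revisit the non-relativistic assumption. This is valid if

$$1 \gg \frac{v_F}{c} \approx \frac{\hbar k_F}{m_ec} \approx \frac{\hbar n^{1/3}}{m_ec} \approx \frac{\hbar N^{1/3}}{m_ecR} \approx \frac{\hbar N^{1/3}}{m_ec\,\frac{\lambda}{m_e\kappa}N^{-1/3}} \approx \frac{\kappa}{\hbar c}N^{2/3} \approx \frac{Gm_p^2}{\hbar c}N^{2/3} = \frac{m_p^2}{m_{\text{Pl}}^2}N^{2/3}.$$

In the last step we have introduced the Planck mass $m_{\text{Pl}} = \sqrt{\hbar c/G}$, which is the unique mass scale associated
with the fundamental constants $G, \hbar, c$. Thus the critical value of $N$ at which the non-relativistic
assumption breaks down is

$$N_{\text{crit}} \approx \left(\frac{m_{\text{Pl}}}{m_p}\right)^3 \approx 10^{57}.$$

The critical mass is $M_{\text{crit}} = N_{\text{crit}}m_p \approx 10^{66}\,\text{eV}/c^2$. By contrast $M_{\text{sun}} \approx 2\cdot 10^{30}\,\text{kg}\cdot 6\cdot 10^{35}\,\frac{\text{eV}/c^2}{\text{kg}} \approx
10^{66}\,\text{eV}/c^2$, which is right at the threshold where the non-relativistic assumption breaks down.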
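These orders of magnitude are easy to verify. The sketch below assumes SI values for $\hbar$, $c$, $G$, the proton mass and a rounded solar mass (the constant names are illustrative) and evaluates $N_{\text{crit}} \approx (m_{\text{Pl}}/m_p)^3$ together with the corresponding mass.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
M_P = 1.67262192e-27     # proton mass in kg
M_SUN = 1.989e30         # solar mass in kg (rounded)

def planck_mass():
    """m_Pl = sqrt(hbar c / G), the mass scale built from G, hbar and c."""
    return math.sqrt(HBAR * C / G)

def critical_nucleon_number():
    """N_crit ~ (m_Pl / m_p)^3, dropping factors of order one."""
    return (planck_mass() / M_P) ** 3

N_crit = critical_nucleon_number()
M_crit_in_suns = N_crit * M_P / M_SUN
print(N_crit, M_crit_in_suns)
```

The nucleon number comes out at a few times $10^{57}$ and the critical mass at a small multiple of $M_{\text{sun}}$, consistent with the statement that the sun sits near the threshold.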


2.2.1. A relativistic free electron gas 


We follow the same approach as before but instead of $E = \hbar^2k^2/2m$ we have

$$E = \sqrt{m^2c^4 + \hbar^2k^2c^2} \approx \hbar c|k|.$$


[Figure 2 panels: $E_{\text{tot}}$ (arbitrary units) versus $R$ (arbitrary units). One panel is labeled “expands until non-relativistic” and the other “collapses to neutron star or black hole”.]

Figure 2: Plot of (56) showing the total energy from gravity and degeneracy pressure of a relativistic
free electron gas.


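The first expression is the exact energy valid for all $k$ and the latter is the ultra-relativistic limit
which is relevant when $\hbar|k| \gg m_ec$. In this case the lowest energy states are still those with $|k| \le k_F$
for some threshold $k_F$, but the modified form of $E$ means we now have

$$E_{\text{degen}} = \int_0^{k_F} 2\left(\frac{L}{2\pi}\right)^3 4\pi k^2\,dk\;\hbar ck$$
$$= \frac{V\hbar c}{4\pi^2}k_F^4$$
$$= \frac{\hbar c}{4\pi^2}(3\pi^2)^{4/3}\frac{N_e^{4/3}}{V^{1/3}}$$
$$= \frac{\kappa'N^{4/3}}{R},$$

where $\kappa' = \frac{3}{4}\left(\frac{9\pi}{4}\right)^{1/3}f^{4/3}\hbar c$. Now the total energy is

$$E_{\text{tot}} = -\frac{\kappa N^2}{R} + \frac{\kappa'N^{4/3}}{R} \equiv \frac{A}{R}. \tag{56}$$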


The situation is now much simpler than in the non-relativistic case. If $A < 0$ then the star collapses
and if $A > 0$ then the star expands until the electrons are no longer ultra-relativistic. See Fig. 2.
The critical value of $N$ at which $A = 0$ occurs when $\kappa N^2 = \kappa'N^{4/3}$. Rearranging we find that
this is at $N_c = (\kappa'/\kappa)^{3/2}$. The corresponding mass turns out (using $f = 0.6$ and $M_{\text{sun}} \approx m_{\text{Pl}}^3/m_p^2$) to be
$M_c = m_pN_c \approx 1.4M_{\text{sun}}$. This bound is called the Chandrasekhar limit. (Actually the true bound
drops some of the simplifying assumptions, such as constant density, but it is not far off from what
we have estimated.)
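The size of $N_c$ can be sketched in units of $(m_{\text{Pl}}/m_p)^3$. The code below assumes $\kappa = \frac{3}{5}Gm_p^2$ from the gravitational energy and $\kappa' = \frac{3}{4}\left(\frac{9\pi}{4}\right)^{1/3}f^{4/3}\hbar c$ for the uniform-density ultra-relativistic gas, so that $\kappa'/\kappa = \frac{5}{4}\left(\frac{9\pi}{4}\right)^{1/3}f^{4/3}\,(m_{\text{Pl}}/m_p)^2$. The function name is illustrative, and the result is only the order-one prefactor of this toy model, not the true Chandrasekhar limit.

```python
import math

def chandrasekhar_prefactor(f):
    """Return N_c / (m_Pl / m_p)^3 = (kappa'/kappa)^(3/2) in those units.

    Assumes kappa = (3/5) G m_p^2 and
    kappa' = (3/4) (9 pi / 4)^(1/3) f^(4/3) hbar c (uniform density).
    """
    ratio = (5 / 4) * (9 * math.pi / 4) ** (1 / 3) * f ** (4 / 3)
    return ratio ** 1.5

prefactor = chandrasekhar_prefactor(0.6)   # f = 0.6 as used in the text
print(prefactor)
```

Multiplying the prefactor by $m_{\text{Pl}}^3/m_p^2$ gives $M_c$; since the prefactor scales as $f^2$, the limit is quite sensitive to the electron fraction.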

We now can predict the fate of our sun: it will become a white dwarf. What happens when
$A < 0$? In this case a white dwarf will collapse but not necessarily to a black hole. At high densities
the reaction $e^- + p^+ \leftrightarrow n + \nu$ will convert all the charged particles into neutrons. Neutrons are
fermions and have their own degeneracy pressure. The analogue of a white dwarf is a neutron star,
which is supported by this neutron degeneracy pressure. We now repeat the above calculation but
with $f = 1$ and with $m_e$ replaced by $m_n$. It turns out that neutron stars are stable up to masses
of roughly $3.0M_{\text{sun}}$ (although this number is fairly uncertain in part because we don’t know the
structure of matter in a neutron star; it may be that the quarks and gluons combine into more
exotic nuclear matter). Above $3.0M_{\text{sun}}$ there is no further way to prevent collapse into a black
hole.


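[Diagram: mass axis $M$, with white dwarfs from 0 to $1.4M_{\text{sun}}$, neutron stars from $1.4M_{\text{sun}}$ to $3.0M_{\text{sun}}$, and black holes above $3.0M_{\text{sun}}$.]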


Besides our sun, some representative stars are Alpha Centauri A at 1.1 solar masses, destined 
to be a white dwarf, and Sirius A at 2.0 solar masses, destined to be a neutron star. 


2.3 Electrons in a periodic potential 


This lecture will explore how degenerate fermions can explain electrical properties of solids. In a 
solid, atomic nuclei are packed closely together in (roughly) fixed positions while some electrons 
are localized near these nuclei and some can be delocalized and move throughout the solid. We 
model this by grouping together the nuclei and localized electrons as a static potential $V(\vec x)$, while
assuming the delocalized electrons are subject to this potential but do not interact with each other. 
In other words, we add a potential but still do not consider interactions. This model is simple but 
already contains nontrivial physics. The resulting Hamiltonian is 


$$H = \sum_{i=1}^{N} \frac{\vec p_i^{\,2}}{2m} + V(\vec x_i).$$


We will see that this gives rise to band structure, which in turn can explain insulators, conductors 
and semiconductors all from the same underlying physics. 


2.3.1 Bloch’s theorem 


Suppose at first that we have a single electron in a 1-d lattice with ions spaced regularly at distance 
a. The resulting potential is periodic and satisfies 


$$V(x) = V(x+a) \tag{57}$$


for all $x$. We can express this equivalently in terms of the translation operator $T_a$, defined by
$T_a\psi(x) = \psi(x+a)$. Since $[T_a, V] = 0$ we have $[T_a, H] = 0$ and thus $T_a$ and $H$ can be simultaneously
diagonalized. Bloch’s theorem essentially states that we can take eigenstates of $H$ to be also
eigenstates of $T_a$.

To see what this means let’s look more carefully at $T_a$. We have seen this in 8.05 already:


$$T_a = \exp\left(\frac{iap}{\hbar}\right) = \exp\left(a\frac{\partial}{\partial x}\right). \tag{58}$$


Our boundary conditions ensure that $p$ is Hermitian, and thus $T_a$ is unitary. Therefore its eigenvalues
are of the form $e^{i\alpha}$ for $\alpha \in \mathbb{R}$. By convention we write the eigenvalues as $e^{ika}$ for some $k \in \mathbb{R}$.
The resulting eigenstates $\psi_{n,k}(x)$ thus satisfy

$$T_a\psi_{n,k}(x) = \psi_{n,k}(x+a) = e^{ika}\psi_{n,k}(x). \tag{59}$$


Here n is a label for any additional degeneracy in the eigenstates of H. 

We can also write $\psi_{n,k}(x) = e^{ikx}u_{n,k}(x)$ with $u_{n,k}(x) = u_{n,k}(x+a)$. This form of Bloch’s
theorem is more conventionally used.

The probability density $|\psi_{n,k}(x)|^2 = |u_{n,k}(x)|^2$ is periodic but delocalized. In that sense it
resembles plane waves, and implies that the electrons are generally free to move around, despite
the presence of the ionic potential.

Range of k values? By definition if we add an integer multiple of $2\pi/a$ to $k$ then the phase $e^{ika}$
is the same. So we can WLOG restrict $k$ to the “Brillouin zone” $-\frac{\pi}{a} \le k \le \frac{\pi}{a}$.

Meaning of k. The value $k$ is referred to as a “crystal momentum,” and even though it is
only defined modulo $2\pi/a$, it has some momentum-like properties. To see this, first look at the
eigenvalue equation satisfied by $u_{n,k}$.


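$$(E_{n,k} - V(x))\,e^{ikx}u_{n,k}(x) = \frac{p^2}{2m}e^{ikx}u_{n,k}(x) = e^{ikx}\frac{(p+\hbar k)^2}{2m}u_{n,k}(x). \tag{60}$$

Rearranging we have

$$\underbrace{\left(\frac{(p+\hbar k)^2}{2m} + V\right)}_{H_k}|u_{n,k}\rangle = E_{n,k}|u_{n,k}\rangle. \tag{61}$$

By the Hellmann-Feynman theorem,

$$\frac{1}{\hbar}\frac{dE_{n,k}}{dk} = \frac{1}{\hbar}\langle u_{n,k}|\frac{dH_k}{dk}|u_{n,k}\rangle \tag{62}$$
$$= \langle u_{n,k}|\frac{p+\hbar k}{m}|u_{n,k}\rangle \tag{63}$$
$$= \langle\psi_{n,k}|\frac{p}{m}|\psi_{n,k}\rangle. \tag{64}$$

If we interpret this last quantity as the velocity and define $\omega_k = E_{n,k}/\hbar$, then we find that the
velocity is equal to $\frac{d\omega_k}{dk}$. This is the usual expression for the group velocity of a wave.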
The 3-d case. We will not explore this in detail here, but here is a very rough treatment.
If we have a cubic lattice with spacing $a$ then we can write $\psi_{n,\vec k}(\vec x) = e^{i\vec k\cdot\vec x}u_{n,\vec k}(\vec x)$ for $u_{n,\vec k}(\vec x) =
u_{n,\vec k}(\vec x + a\hat e_i)$ for $i = 1,2,3$. The resulting Brillouin zone is defined by $|k_i| \le \frac{\pi}{a}$.
More generally suppose we have a lattice that is periodic under translation by $\vec a_1, \vec a_2, \vec a_3$. The
theory of crystallographic groups includes a classification of possible values of $\vec a_1, \vec a_2, \vec a_3$. The “reciprocal
lattice vectors” $\vec b_1, \vec b_2, \vec b_3$ are defined by the relations

$$\vec a_i\cdot\vec b_j = \delta_{ij}. \tag{65}$$

This is equivalent to the matrix equation

$$\begin{pmatrix}\vec a_1\\ \vec a_2\\ \vec a_3\end{pmatrix}\begin{pmatrix}\vec b_1 & \vec b_2 & \vec b_3\end{pmatrix} = I_3. \tag{66}$$

Similar arguments imply that $\vec k$ lives in a space that is periodic modulo translations by $\vec b_1, \vec b_2, \vec b_3$.
For more details, see a solid-state physics class, like 8.231, or a textbook like Solid State Physics
by Ashcroft and Mermin.


The tight-binding model. On your pset you will consider some more general models, but for 
now let us consider a simple model that can be exactly solved. In the tight-binding model the 
potential consists of deep wells, each containing a single bound state with energy $E_0$. Let $|n\rangle$
denote the state of an electron trapped at $x = na$ for $n = \ldots,-2,-1,0,1,2,\ldots$. Suppose there is
also a small tunneling term between adjacent sites, so that the Hamiltonian is

$$H = \sum_{n=-\infty}^{\infty} E_0|n\rangle\langle n| - \Delta|n+1\rangle\langle n| - \Delta|n-1\rangle\langle n|.$$

This can be rewritten in terms of the translation operator $T = \sum_n |n+1\rangle\langle n|$ as

$$H = E_0I - \Delta(T + T^\dagger). \tag{67}$$

By Bloch’s theorem the eigenstates can be labeled by $k$, with

$$T|\psi_k\rangle = e^{ika}|\psi_k\rangle \tag{68}$$
$$H|\psi_k\rangle = E_k|\psi_k\rangle \tag{69}$$

From (67) we can calculate

$$E_k = E_0 - \Delta(e^{ika} + e^{-ika}) = E_0 - 2\Delta\cos(ka). \tag{70}$$

If $\Delta = 0$ then we have an infinite number of states with degenerate energy equal to $E_0$. But when
$\Delta \neq 0$ this broadens into a finite energy band $E_0 \pm 2\Delta$ (see plot).


[Plot: $E_k$ versus $k$, with the band running from $E_0 - 2\Delta$ through $E_0$ to $E_0 + 2\Delta$.]

Real solids are not infinite. Suppose there are $N$ sites with periodic boundary conditions,
i.e. let $L = Na$ and suppose that $\psi(0) = \psi(L)$. This implies that $T^N|\psi\rangle = |\psi\rangle$ and therefore for any
eigenvalue $e^{ika}$ we must have

$$e^{iNka} = 1 \tag{71}$$
$$Nka = 2\pi n \quad\text{for some integer } n \tag{72}$$
$$k = \frac{2\pi n}{Na} = \frac{2\pi}{L}n, \qquad -\frac{N}{2} \le n < \frac{N}{2}. \tag{73}$$

The energy levels are along the same band as before but now the allowed values of $k$ are integer
multiples of $\frac{2\pi}{L}$.
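The dispersion (70) and the quantization condition (73) can be verified directly on a small ring. The sketch below is illustrative only: it assumes $N = 12$ sites, lattice spacing $a = 1$ and hypothetical values $E_0 = 1$, $\Delta = 0.3$, builds each Bloch state with amplitude $e^{ikna}/\sqrt{N}$ on site $n$, applies $H = E_0 I - \Delta(T + T^\dagger)$ with periodic boundary conditions, and measures how far $H|\psi_k\rangle$ is from $E_k|\psi_k\rangle$.

```python
import cmath
import math

def apply_hamiltonian(psi, E0, delta):
    """Apply H = E0*I - delta*(T + T^dagger) to amplitudes psi on a ring."""
    N = len(psi)
    return [E0 * psi[n] - delta * (psi[(n - 1) % N] + psi[(n + 1) % N])
            for n in range(N)]

def bloch_state(N, n_k, a=1.0):
    """Return k = 2*pi*n_k/(N*a) and the normalized state with amplitudes e^{i k n a}."""
    k = 2 * math.pi * n_k / (N * a)
    return k, [cmath.exp(1j * k * n * a) / math.sqrt(N) for n in range(N)]

def band_energy(k, E0, delta, a=1.0):
    """Tight-binding dispersion E_k = E0 - 2*delta*cos(k*a), as in (70)."""
    return E0 - 2 * delta * math.cos(k * a)

def max_residual(N, E0, delta):
    """Largest component of H|psi_k> - E_k|psi_k> over all allowed k."""
    worst = 0.0
    for n_k in range(-(N // 2), N - N // 2):      # N allowed values of k
        k, psi = bloch_state(N, n_k)
        H_psi = apply_hamiltonian(psi, E0, delta)
        E_k = band_energy(k, E0, delta)
        worst = max(worst, max(abs(h - E_k * p) for h, p in zip(H_psi, psi)))
    return worst

residual = max_residual(12, 1.0, 0.3)
print(residual)   # should be at the level of floating-point rounding
```

A residual at rounding level for every allowed $k$ confirms that the $N$ Bloch states are exact eigenstates of the ring, with energies lying on the band $E_0 \pm 2\Delta$.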


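[Plot: $E_k$ versus $k$ showing individual energy levels along the band from $E_0 - 2\Delta$ through $E_0$ to $E_0 + 2\Delta$, with spacing $\frac{2\pi}{L}$ between adjacent values of $k$.]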


Each one of these points corresponds to a delocalized state. 


Band structure. Now suppose each site can support multiple bound states, say with energies
$E_0 < E_1 < E_2$. Then tunneling opens up a band around each energy. If $\Delta$ is small enough then
there will be a band gap between these.


Kronig-Penney model. Another model that can be exactly solved is a periodic array of delta
functions. We skip this because Griffiths has a good treatment in section 5.3.2. 


Free electron. We now turn to a rather trivial example of a periodic potential, namely V = 0. 
This is periodic for any choice of a. Still it is instructive to apply Bloch’s theorem, which implies 
that eigenstates can be written in the form 


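$$\psi_{n,k}(x) = e^{ikx}u_{n,k}(x), \tag{74}$$

where $-\frac{\pi}{a} \le k \le \frac{\pi}{a}$ and $u_{n,k}(x) = u_{n,k}(x+a)$. One such choice of $u_{n,k}(x)$ is $e^{2\pi inx/a}$. Plugging this
into (74) we obtain

$$\psi_{n,k}(x) = e^{i\left(\frac{2\pi n}{a}+k\right)x} \quad\text{and}\quad E_{n,k} = \frac{\hbar^2}{2m}\left(\frac{2\pi n}{a}+k\right)^2. \tag{75}$$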


This corresponds to a separation of scales into high and low frequency; the index n describes the 
rapid oscillations that occur within a unit cell of size a and the crystal momentum k describes the 
long-wavelength behavior that can only be seen by looking across many different cells. 

If we did not use Bloch’s theorem at all then the allowed energies would simply look like the 
parabola $E = p^2/2m$. Dividing momentum into $k$ and $n$ leads to a folded parabola; see Fig. 3.
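The folding can be made concrete with a small helper that splits any free-particle wavevector $q$ into a band index $n$ and a crystal momentum $k$ in the first Brillouin zone, with $q = 2\pi n/a + k$. This is a sketch in units $\hbar = m = 1$; the function names and the sample value $q = 4$ are illustrative assumptions.

```python
import math

def fold_into_zone(q, a):
    """Split q into (n, k) with q = 2*pi*n/a + k and -pi/a <= k < pi/a."""
    G = 2 * math.pi / a                # width of the Brillouin zone
    n = math.floor(q / G + 0.5)        # nearest multiple of G
    k = q - n * G                      # remainder inside the first zone
    return n, k

def band_energy(n, k, a, hbar=1.0, m=1.0):
    """E_{n,k} = hbar^2 (2*pi*n/a + k)^2 / (2m), as in (75)."""
    return hbar ** 2 * (2 * math.pi * n / a + k) ** 2 / (2 * m)

a = 1.0
q = 4.0                                # a wavevector outside the first zone
n, k = fold_into_zone(q, a)
print(n, k, band_energy(n, k, a))      # the energy equals q^2 / 2 in these units
```

Sweeping `q` along the real line and plotting `band_energy` against the folded `k` traces out the folded parabola, one branch per value of `n`.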


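Nearly-free electrons. Now suppose that $V$ is nonzero but very weak. Since $[T_a, V] = 0$, $V$
will be block diagonal when written in the eigenbasis of $T_a$. Recall that the eigenvalues of $T_a$ are
$e^{ika}$ for $k$ an integer multiple of $\frac{2\pi}{L}$. If we let $\phi := 2\pi a/L$ then there is a basis in which we have

$$T_a = \begin{pmatrix} 1 & & & & & \\ & 1 & & & & \\ & & e^{i\phi} & & & \\ & & & e^{i\phi} & & \\ & & & & e^{2i\phi} & \\ & & & & & \ddots \end{pmatrix} \quad\text{and}\quad V = \begin{pmatrix} * & * & & & & \\ * & * & & & & \\ & & * & * & & \\ & & * & * & & \\ & & & & * & \\ & & & & & \ddots \end{pmatrix}.$$

Figure 3: Band diagram for free electrons.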

These blocks correspond to single values of k which are vertical lines in Fig. 3. Treat V as a 
perturbation. For a typical value of k the kinetic energy of these points is well-separated and so 
$V$ does not significantly mix the free states. However, near $k = \pm\pi/a$, the kinetic energy term has a
degeneracy, so there the addition of V will lead to a splitting, which will open up a gap between 
the bands (see figure drawn in class). 
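The opening of the gap can be illustrated with the smallest possible block. The sketch below makes two assumptions that are not in the notes: it keeps only the two free states with wavevectors $k$ and $k - 2\pi/a$, and it introduces a hypothetical coupling strength `v_abs` standing for the magnitude of the matrix element of $V$ between them. Diagonalizing the resulting $2 \times 2$ matrix gives the two band energies (units $\hbar = m = 1$).

```python
import math

def free_energy(q, hbar=1.0, m=1.0):
    """Free-particle kinetic energy hbar^2 q^2 / (2m)."""
    return hbar ** 2 * q ** 2 / (2 * m)

def two_band_energies(eps1, eps2, v_abs):
    """Eigenvalues of [[eps1, v], [conj(v), eps2]] with |v| = v_abs."""
    mean = (eps1 + eps2) / 2
    split = math.hypot((eps1 - eps2) / 2, v_abs)
    return mean - split, mean + split

def nearly_free_bands(k, a, v_abs):
    """Lowest two bands from mixing the free states at k and k - 2*pi/a."""
    G = 2 * math.pi / a
    return two_band_energies(free_energy(k), free_energy(k - G), v_abs)

a, v_abs = 1.0, 0.1                       # illustrative lattice spacing and coupling
lower_edge, upper_edge = nearly_free_bands(math.pi / a, a, v_abs)
lower_mid, upper_mid = nearly_free_bands(0.5, a, v_abs)
print(upper_edge - lower_edge, lower_mid - free_energy(0.5))
```

At the zone boundary the two kinetic energies coincide and the splitting is $2|v|$, first order in the potential; away from the boundary the shift is only second order, which is why $V$ matters mainly near the degeneracy.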

Conductors, semiconductors and insulators. A band with N sites can hold 2N electrons, 
once we take spin into account. Suppose that there are M delocalized electrons in this band. If 
M < 2N then the band is partly filled. This means that it is possible for an electron at the edge of 
the filled region (see blackboard figure) to gain a small amount of momentum, perhaps in response 
to an applied electric field. In this situation we have a conductor. For example, in sodium there is 
1 free electron per atom, so the band is half full. A crystal can also be a conductor if bands overlap 
(something more likely in three dimensions) resulting in multiple partially full bands. 

Alternatively, suppose $M = 2N$, so the band is completely full. If there is a large band gap
then there is no way for an electron to absorb a small amount of energy and accelerate. In this 
case the material is an insulator. 

If the band gap is small then the material is a semiconductor. Call the band below the gap the
“valence band” and the band right above the gap the “conduction band.” At $T = 0$ the valence
band is completely full and the conduction band is completely empty, but for $T > 0$ there are a few
excited electrons in the conduction band and a few holes in the valence band. These are mobile
and can carry current. The holes behave like particles with positive charge and negative mass.

The Fermi surface can also be pushed up or down by “doping” with impurities that either 
contribute or accept electrons. Adding a “donor” like phosphorus to silicon adds localized electrons 
right below the conduction band. A small electric field is enough to move these into the conduction
band. The resulting material is called an “n-type semiconductor.” On the other hand, aluminum
has one fewer electron than silicon, and so is an “acceptor.” Adding aluminum will increase the
number of holes in the valence band and reduce the number of conduction electrons, resulting 
in a “p-type semiconductor.” An interface between n-type and p-type semiconductors is called a 
p-n junction, and is used for diodes (including LEDs), solar cells, transistors and other electronic 
devices. 


3 Charged particles in a magnetic field 


3.1 The Pauli Hamiltonian 


Consider a particle with charge $q$ and mass $m$. We will study its interactions with an electromagnetic
field. To write down the Hamiltonian we will use not $\vec E$ and $\vec B$ but instead the vector potential $\vec A$
and the scalar potential $\phi$. Recall that their relation is

$$\vec B = \nabla\times\vec A \quad\text{and}\quad \vec E = -\nabla\phi - \frac{1}{c}\frac{\partial\vec A}{\partial t}. \tag{76}$$

We have seen this already for the electric field, where the contribution to the Hamiltonian is $q\phi(\vec x)$.
The force from the magnetic field is velocity dependent (recall the classical EOM $m\ddot{\vec x} = q(\vec E + \frac{\vec v}{c}\times\vec B)$
with $\vec v = \dot{\vec x}$), so its contribution to the Hamiltonian cannot be as a potential term.

To derive the correct quantum Hamiltonian we can start with the classical Hamiltonian and
follow the prescription of canonical quantization (cf. “Supplementary Notes: Canonical Quantization and
Application to the Quantum Mechanics of a Charged Particle in a Magnetic Field”), or we can start with the
Dirac equation and consider the nonrelativistic limit. Another option is to add a term $\frac{q}{c}\dot{\vec x}\cdot
\vec A$ to the Lagrangian. Either way the magnetic field turns out to enter the Hamiltonian by
replacing the kinetic energy term with $\frac{1}{2m}(\vec p - \frac{q}{c}\vec A(\vec x))^2$. We thus obtain the Pauli Hamiltonian
(neglecting an additional spin term):

$$H = \frac{1}{2m}\left(\vec p - \frac{q}{c}\vec A(\vec x)\right)^2 + q\phi(\vec x). \tag{77}$$
There are a few subtle features of (77). First, it contains a massive redundancy in that we are free 
to choose an arbitrary gauge for $\vec A$ and $\phi$ without changing the physics. Specifically if we replace
$\vec A, \phi$ with

$$\vec A' = \vec A + \nabla f \quad\text{and}\quad \phi' = \phi - \frac{1}{c}\frac{\partial f}{\partial t}, \tag{78}$$

then all observable quantities will remain the same. This gauge-invariance will be explored more
on your pset. For now observe that plugging (78) into (76) leaves $\vec E, \vec B$ unchanged.

Also observe there are now two things we might call momentum: the original $\vec p$ and the term
appearing in the Hamiltonian $m\vec v = \vec p - \frac{q}{c}\vec A(\vec x)$. The operator $\vec p$ still satisfies $[p_i, x_j] = -i\hbar\delta_{ij}$ and is
called the generalized momentum. By contrast, $m\vec v$ is called the kinetic momentum. We will see on
the pset that expectation values of one of these is gauge-invariant; this one can thus be physically
observed. (Sometimes an observable is said to be gauge invariant; this generally means not that
the operator is gauge invariant but that its expectation values are. More concretely if $O'$ is the
transformed version of $O$ then we say $O$ is gauge invariant if $\langle\psi'|O'|\psi'\rangle = \langle\psi|O|\psi\rangle$.)

Remark: This gauge freedom will appear many times in the coming lectures and often leads 
to seemingly strange results. It is worth remembering that we are already used to a simpler form of 
gauge invariance: that of replacing $H(t)$ with $H(t) + f(t)I$, and $|\psi(t)\rangle$ with $\exp\left(-i\int_0^t f(t')\,dt'/\hbar\right)|\psi(t)\rangle$.
This change clearly describes the same physics and we have (often implicitly) restricted our attention 
to gauge-invariant observables. Specifically, we recognize that energies are arbitrary (here “non- 
gauge-invariant”) while differences in energies are physically observable (i.e. “gauge-invariant” ). 
This also points to the difficulties in formulating an analogue of the Schrödinger equation in
which there are no redundant degrees of freedom, such as the overall phase. If we tried to
express Hamiltonians in terms of energy differences rather than energy levels then the differential
equation would involve terms that were sums of many of these differences (e.g. $E_4 - E_1 =
(E_4 - E_3) + (E_3 - E_2) + (E_2 - E_1)$), thus taking on a “non-local” character analogous to what
would happen if we tried to express (77) in terms of $\vec E, \vec B$ instead of $\vec A, \phi$.

As a sanity check on (77) we will show that it reproduces the right classical equations of motion. 
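Recall that Hamilton’s equations of motion are

$$\dot x_i = \frac{\partial H}{\partial p_i} \quad\text{and}\quad \dot p_i = -\frac{\partial H}{\partial x_i}.$$

First we calculate

$$\dot x_i = \frac{\partial H}{\partial p_i} = \frac{1}{m}\left(p_i - \frac{q}{c}A_i\right) \quad\Longrightarrow\quad p_i = m\dot x_i + \frac{q}{c}A_i, \tag{79}$$

obtaining a generalized momentum $\vec p$. Next we should remember that $\vec A(\vec x), \phi(\vec x)$ depend on position
so that

$$\dot p_i = -\frac{\partial H}{\partial x_i} = \frac{q}{mc}\sum_j\left(p_j - \frac{q}{c}A_j\right)\frac{\partial A_j}{\partial x_i} - q\frac{\partial\phi}{\partial x_i} = \frac{q}{c}\sum_j\dot x_j\frac{\partial A_j}{\partial x_i} - q\frac{\partial\phi}{\partial x_i}. \tag{80}$$

To evaluate the LHS, we apply $d/dt$ to (79) and obtain

$$\dot p_i = \frac{d}{dt}\left(m\dot x_i + \frac{q}{c}A_i\right) = m\ddot x_i + \frac{q}{c}\left(\frac{\partial A_i}{\partial t} + \sum_j\frac{\partial A_i}{\partial x_j}\dot x_j\right). \tag{81}$$

Combining this with (80) and rearranging we obtain

$$m\ddot x_i = q\left(-\frac{\partial\phi}{\partial x_i} - \frac{1}{c}\frac{\partial A_i}{\partial t}\right) + \frac{q}{c}\sum_j\dot x_j\left(\frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j}\right). \tag{82}$$

To simplify the last term, observe that, for fixed $i, j$,

$$\frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j} = \sum_k\epsilon_{ijk}(\nabla\times\vec A)_k = \sum_k\epsilon_{ijk}B_k$$

and thus

$$\sum_j\dot x_j\left(\frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j}\right) = \sum_{j,k}\epsilon_{ijk}\dot x_jB_k = (\dot{\vec x}\times\vec B)_i.$$

We can now rewrite (82) in vector notation as

$$m\ddot{\vec x} = q\left(-\nabla\phi - \frac{1}{c}\frac{\partial\vec A}{\partial t}\right) + \frac{q}{c}\dot{\vec x}\times\vec B = q\left(\vec E + \frac{1}{c}\dot{\vec x}\times\vec B\right). \tag{83}$$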


This recovers the familiar Lorentz force law. Phew! 
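The same check can be done numerically by integrating Hamilton's equations for (77) and confirming that the particle moves on a closed circular orbit. The sketch below makes several simplifying assumptions: units with $q = m = c = 1$, $\phi = 0$, a uniform field $B$ along $\hat z$, and the gauge choice $\vec A = (-By, 0, 0)$, picked here only for illustration. With these choices $H = \frac{1}{2}\left[(p_x + By)^2 + p_y^2\right]$.

```python
import math

def derivatives(state, B):
    """Hamilton's equations for H = ((p_x + B*y)^2 + p_y^2) / 2."""
    x, y, px, py = state
    vx = px + B * y            # dx/dt = dH/dp_x, the kinetic velocity
    vy = py                    # dy/dt = dH/dp_y
    return (vx, vy, 0.0, -B * vx)   # dp_x/dt = 0, dp_y/dt = -dH/dy

def rk4_step(state, dt, B):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = derivatives(state, B)
    k2 = derivatives(shifted(state, k1, dt / 2), B)
    k3 = derivatives(shifted(state, k2, dt / 2), B)
    k4 = derivatives(shifted(state, k3, dt), B)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(state, B, t_total, steps):
    """Integrate from t = 0 to t_total in equal steps."""
    dt = t_total / steps
    for _ in range(steps):
        state = rk4_step(state, dt, B)
    return state

B = 2.0
start = (0.0, 0.0, 1.0, 0.0)           # x, y, p_x, p_y: unit speed along x
period = 2 * math.pi / B               # expected orbital period in these units
final = evolve(start, B, period, 2000)
half = evolve(start, B, period / 2, 1000)
print(final, half)
```

After one period the state returns to its starting point, and after half a period the particle sits at the opposite side of a circle of radius $1/B$, as expected from the Lorentz force law.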


3.2 Landau levels 


For the rest of this section we consider particles with charge $q$ and mass $m$ confined to the $x$-$y$
plane with $\vec E = 0$ and $\vec B = B\hat z = (0,0,B)$.
The classical equations of motion are

$$m\ddot{\vec x} = \frac{q}{c}\dot{\vec x}\times\vec B$$

$$\begin{pmatrix}\ddot x\\ \ddot y\end{pmatrix} = \frac{qB}{mc}\begin{pmatrix}\dot y\\ -\dot x\end{pmatrix}$$

This corresponds to circular motion in the $x$-$y$ plane with frequency $\omega_L = \frac{qB}{mc}$ called the Larmor
frequency. Circular motion is periodic and we might expect the quantum system to behave in part
like a harmonic oscillator.

To solve the quantum case we need to choose a gauge. Here we face the sometimes conflicting
priorities of making symmetries manifest and simplifying our calculations. We will choose the
“Landau gauge” to make things simple, namely

$$\vec A = (-By, 0, 0). \tag{84}$$

Another variant of the Landau gauge is $\vec A = (0, Bx, 0)$. On the pset you will also explore the
“symmetric gauge,” defined to be $\vec A = (-\frac{1}{2}By, \frac{1}{2}Bx, 0)$. It is nontrivial to show that these result
in the same physics, but this too will be partially explored on the pset.

In the Landau gauge we have 


$$H = \frac{1}{2m}\left(p_x + \frac{qB}{c}y\right)^2 + \frac{1}{2m}p_y^2. \tag{85}$$

(We might also add a $p_z^2/2m$ term if we do not assume the particles are confined in the $x$-$y$ plane.
This degree of freedom is in any case independent of the others and can be ignored.)

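To diagonalize (85) the trick is to realize that $[H, p_x] = 0$. Thus the eigenstates of $H$ can be chosen to also be
eigenstates of $p_x$. Let us restrict to the $\hbar k_x$ eigenspace of $p_x$. Denote this restriction by $H_{k_x}$. Then

$$H_{k_x} = \frac{p_y^2}{2m} + \frac{1}{2m}\left(\hbar k_x + \frac{qB}{c}y\right)^2 = \frac{p_y^2}{2m} + \frac{1}{2}m\left(\frac{qB}{mc}\right)^2\left(y - \frac{-\hbar k_xc}{qB}\right)^2. \tag{86}$$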

This is just a harmonic oscillator! The frequency is

$$\omega_L = \frac{qB}{mc}$$

(i.e. the Larmor frequency) and the center of the oscillations is offset from the origin by

$$y_0 = \frac{-\hbar k_xc}{qB} = -l_0^2k_x.$$

In the last step we have defined $l_0$ to be the characteristic length scale of the harmonic oscillator,
i.e. $l_0 = \sqrt{\frac{\hbar}{m\omega_L}} = \sqrt{\frac{\hbar c}{qB}}$.

We can now diagonalize $H$ from (85). The eigenstates are labeled by $k_x$ and $n_y$, and have
energies and wavefunctions given by

$$E_{k_x,n_y} = \hbar\omega_L(n_y + 1/2) \tag{87}$$
$$\psi_{k_x,n_y}(x,y) = e^{ik_xx}\,\phi_{n_y}\!\left(\frac{y - y_0}{l_0}\right) \tag{88}$$

where $\phi_n(y)$ is the $n^{\text{th}}$ eigenstate of the standard harmonic oscillator. As a reminder,

$$\phi_n(y) = \frac{(-1)^n}{\sqrt{2^nn!}\,\pi^{1/4}}\,e^{y^2/2}\frac{d^n}{dy^n}e^{-y^2}.$$

We refer to the different n, as “Landau levels.” The lowest Landau level (LL) has n, = 0, the 
next lowest has ny = 1, etc. States within a LL are indexed by kz. Since this can take on an infinite 
number of values (any real number), the LLs are infinitely degenerate. 

These basis states look rather different from the small circular orbits that we observe classically. 
They are completely delocalized in the x direction, while in the y direction they oscillate over a
band of size $O(l_0)$ centered around a position depending on $k_x$. Of course we could’ve chosen a
different gauge and obtained eigenstates that are delocalized in the y direction and localized in 
the x direction. And on the pset you will see that the symmetric gauge yields something closer to 
the classical circular orbits. The reason these alternate pictures can be all simultaneously valid is 
the enormous degeneracy of the Landau levels. Changing from one set of eigenstates to another 
corresponds to a change of basis. 


3.3 The de Haas-van Alphen effect


To make the above picture more realistic, let’s suppose our particle is confined to a finite region 
in the plane, say with dimensions L x W. For simplicity we impose periodic boundary conditions. 
This means that 


$$k_x = \frac{2\pi}{W}\,n_x, \qquad n_x \in \mathbb{Z}.$$

We also have the constraint that $y_0$ should stay within the sample. (Assume that $l_0 \ll L, W$ so we
can ignore boundary effects.) Then $0 \le y_0 \le L$, or equivalently

$$-\frac{WLqB}{2\pi\hbar c} \le n_x \le 0.$$

This implies that each LL has a finite degeneracy

$$D = WL\,\frac{qB}{hc} = \frac{A}{2\pi l_0^2} = \frac{BA}{hc/q} = \frac{\Phi}{\Phi_0}. \tag{89}$$

Here $A = WL$ is the area of the sample, $\Phi = BA$ is the flux through it and $\Phi_0 = hc/e$ is the
fundamental flux quantum (specializing here to electrons so $q = -e$).
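
To get a feel for the numbers in (89), here is a minimal sketch that evaluates the flux quantum, the degeneracy per Landau level and the length $l_0$ in Gaussian units. The function names and the sample values (1 tesla, 1 cm²) are illustrative assumptions, not part of the notes.

```python
import math

# Physical constants in Gaussian (CGS) units.
H_PLANCK = 6.62607015e-27    # erg s
C_LIGHT = 2.99792458e10      # cm / s
E_CHARGE = 4.80320471e-10    # statcoulomb

# Flux quantum Phi_0 = hc/e, in gauss cm^2.
PHI_0 = H_PLANCK * C_LIGHT / E_CHARGE


def landau_degeneracy(b_gauss, area_cm2):
    """States per Landau level, D = B A / Phi_0 (spin neglected), as in (89)."""
    return b_gauss * area_cm2 / PHI_0


def magnetic_length_cm(b_gauss):
    """Oscillator length l_0 = sqrt(hbar c / (e B)) in cm."""
    hbar = H_PLANCK / (2 * math.pi)
    return math.sqrt(hbar * C_LIGHT / (E_CHARGE * b_gauss))


if __name__ == "__main__":
    b = 1.0e4   # 1 tesla expressed in gauss (illustrative choice)
    area = 1.0  # cm^2 (illustrative choice)
    print("Phi_0 [G cm^2]:", PHI_0)
    print("D:", landau_degeneracy(b, area))
    print("l_0 [cm]:", magnetic_length_cm(b))
```

The two expressions for $D$ in (89), $BA/\Phi_0$ and $A/2\pi l_0^2$, should agree for any field, which is a quick consistency check on the definitions.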




Let us now examine the induced magnetization. In general the induced magnetic moment $\vec\mu_I$ is
given by $-\nabla_{\vec B}E$. We can classify this response based on the sign of $\vec\mu_I\cdot\vec B$. If it is positive we say
the material is paramagnetic and if it is negative we say it is diamagnetic. (Ferromagnetism can be 
thought of as a variant of paramagnetism in which there can be a nonzero dipole moment even with 
zero applied field. This “memory effect” is known as hysteresis and comes from an enhanced spin- 
spin interaction known as the exchange interaction; cf. pset 8.) The standard examples of para- and 
diamagnetism are spin and orbital angular momentum respectively. At finite temperature spins 
will prefer to align with an applied magnetic field, thus enhancing it, while induced current loops 
(e.g. orbital angular momentum) will oppose an applied field. 

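What is the induced magnetic moment for an electron in the $n^{\text{th}}$ LL? The energy is

$$E = \hbar\omega_L(n + 1/2) = \frac{\hbar eB}{2m_ec}(2n+1) = \mu_B B\,(2n+1), \qquad \mu_B \equiv \frac{e\hbar}{2m_ec}.$$

Thus the induced dipole moment is

$$\mu_I = -\frac{\partial E}{\partial B} = -(2n+1)\,\mu_B.$$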


The minus sign means diamagnetism, corresponding to the fact that the circular orbits caused by 
a magnetic field will oppose that field. (We neglect here the contributions from spin.) The 2n + 1 
reflects the fact that higher LLs correspond to larger oscillations. 

So it looks like the quantum effects do not change the basic diamagnetism predicted by Maxwell’s 
equations, right? Not so fast! That was for one electron. Now let’s look at N electrons. The 
magnetism of a material is the induced magnetic moment per unit area, i.e. 


$$M = -\frac{1}{A}\frac{\partial E_{\text{tot}}}{\partial B}.$$

To calculate this we need to compute the total energy as a function of B. This is nontrivial
because the degeneracy of each LL grows with B. Thus as B increases each electron in a given LL 
gains energy, but each LL can hold more electrons, so in the ground state some electrons will move 
to a lower LL. The competition between these two effects will give rise to the de Haas-van Alphen 
effect. Below is a sketch of what this looks like. 


[Sketch: occupation of the Landau levels above E = 0 at low field (left) and at high field (right).]


Let $\nu = N/D$ be the number of filled LLs. Recall that $D = BA/\Phi_0$. (Neglect spin, in part
because the B field splits the two spin states.) Since B is the experimentally accessible parameter
with the easiest knob to turn (as opposed to N or A), we can rewrite $\nu$ as

$$\nu = \frac{\Phi_0 N}{BA} \equiv \frac{B_0}{B}, \qquad B_0 \equiv \frac{\Phi_0 N}{A}.$$


[Plot: de Haas-van Alphen oscillations in energy; horizontal axis $\nu = B_0/B$.]

Figure 4: Ground-state energy E as a function of filling fraction, according to (90).


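The number of fully filled LLs is $j = \lfloor\nu\rfloor$, meaning the largest integer $\le \nu$. Thus $j \le \nu < j+1$.
The energy of the ground state is then

$$E = \underbrace{\sum_{n=0}^{j-1} D\,\hbar\omega_L(n+1/2)}_{\text{filled levels}} + \underbrace{(N - jD)\,\hbar\omega_L(j+1/2)}_{\text{partially filled level}} = \frac{N\hbar\omega_L}{2}\left[\frac{1}{\nu}\sum_{n=0}^{j-1}(2n+1) + \left(1 - \frac{j}{\nu}\right)(2j+1)\right].$$

Using $\hbar\omega_L/2 = \mu_B B = \mu_B B_0/\nu$ and $\sum_{n=0}^{j-1}(2n+1) = j^2$ we obtain

$$E = N\mu_B B_0\left(\frac{2j+1}{\nu} - \frac{j(j+1)}{\nu^2}\right). \tag{90}$$

At integer points $E/N\mu_B B_0 = 1$. For $\nu < 1$ this equals $1/\nu$ and for $1 \le \nu < 2$ we have $E/N\mu_B B_0 = \frac{3}{\nu} - \frac{2}{\nu^2}$. In general there are oscillations at every integer value of $\nu$. This is plotted in Fig. 4.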
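
The step from the level-by-level sum to the closed form (90) can be checked numerically. The sketch below works in units of $N\mu_B B_0$; the function names are illustrative assumptions.

```python
import math


def energy_by_filling(nu):
    """E / (N mu_B B_0) obtained by filling Landau levels one at a time.

    Each full level holds a fraction D/N = 1/nu of the electrons, and
    hbar*omega_L/2 = mu_B*B_0/nu sets the overall energy scale.
    """
    j = math.floor(nu)
    filled = sum((2 * n + 1) / nu for n in range(j))
    partial = (1 - j / nu) * (2 * j + 1)
    return (filled + partial) / nu


def energy_closed_form(nu):
    """E / (N mu_B B_0) from the closed form (90)."""
    j = math.floor(nu)
    return (2 * j + 1) / nu - j * (j + 1) / nu ** 2


if __name__ == "__main__":
    for nu in (0.5, 1.0, 1.5, 2.0, 3.7):
        print(nu, energy_by_filling(nu), energy_closed_form(nu))
```

Both functions return 1 at every integer $\nu$ and $1/\nu$ for $\nu < 1$, matching the special cases quoted above.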


These oscillations mean that the magnetism M will oscillate between positive and negative. We
calculate

$$M = -\frac{1}{A}\frac{\partial E}{\partial B} = -n\mu_B\left((2j+1) - \frac{1}{\nu}\,2j(j+1)\right), \tag{91}$$

where $n = N/A$ is the electron density. This is illustrated in Fig. 5. The key features are the oscillations between extrema of $M = \pm n\mu_B$
with discontinuities at integer values of $\nu$. Also observe that for $\nu < 1$ all electrons are in a
single LL, so we observe the simple classical prediction of diamagnetism, which we refer to here as
“Landau diamagnetism” even though it is the only part of this diagram where the Landau levels
do not really play an important role.
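
As a rough check on (91), the sketch below evaluates $M/n\mu_B$ and compares it with a numerical derivative of the energy (90) with respect to $B$. It works in the scaled variable $b = B/B_0 = 1/\nu$; the function names and step size are illustrative assumptions.

```python
import math


def scaled_energy(b):
    """E / (N mu_B B_0) from (90), written as a function of b = B/B_0 = 1/nu."""
    nu = 1.0 / b
    j = math.floor(nu)
    return (2 * j + 1) / nu - j * (j + 1) / nu ** 2


def magnetization(nu):
    """M / (n mu_B) from (91)."""
    j = math.floor(nu)
    return -((2 * j + 1) - 2 * j * (j + 1) / nu)


def magnetization_numeric(nu, step=1e-6):
    """-d(E/N mu_B B_0)/db by central differences (valid away from integer nu)."""
    b = 1.0 / nu
    return -(scaled_energy(b + step) - scaled_energy(b - step)) / (2 * step)


if __name__ == "__main__":
    for nu in (0.3, 1.2, 1.5, 2.5, 3.9):
        print(nu, magnetization(nu), magnetization_numeric(nu))
```

For $\nu < 1$ the result is the constant $-1$ (pure diamagnetism), and just above and below each integer $\nu$ it approaches $+1$ and $-1$ respectively.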


[Plot: de Haas-van Alphen oscillations in magnetism, with the region $\nu < 1$ labeled “Landau diamagnetism”; horizontal axis $\nu = B_0/B$.]

Figure 5: Magnetism M as a function of filling fraction, according to (91).


3.4 Integer Quantum Hall Effect 


The IQHE (integer quantum Hall effect) is a rare example of quantization that can be observed at
a macroscopic level. The Hall conductance (defined below) is found to be integer (or in some cases
fractional) multiples of $e^2/h$ to an accuracy of $\approx 10^{-9}$. This allows extremely precise measurements
of fundamental constants such as $e^2/h$ or (combined with other measurements) $\alpha = e^2/\hbar c$.

The quantum Hall effect also lets us determine the sign of the charge carriers.

The classical Hall effect. This was discovered by Edwin Hall in 1879. Consider a sheet of
conducting material in the x-y plane with a constant electric field E in the y-direction.


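[Sketch: a sample of length L and width W in the x-y plane, with the electric field E along y.]

As discussed above, if the mean time between scattering events is $\tau_0$ then there is a drift velocity
$\vec v_d = \frac{q\tau_0}{m}\vec E$, giving rise to a current density $\vec j = nq\vec v_d = \frac{nq^2\tau_0}{m}\vec E$. Thus the conductivity is $\sigma_0 = \frac{nq^2\tau_0}{m}$.
We can also define the resistivity $\rho_0 = 1/\sigma_0$.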


Now let’s apply a magnetic field $\vec B$ in the $\hat z$ direction. This causes the velocity-dependent force

$$\vec F = q\vec E + q\,\frac{\vec v}{c}\times\vec B. \tag{92}$$

This equation assumes that $v \ll c$, which we will see later is implied by the assumption $E \ll B$.
To see that this assumption is reasonable, note that in units where $\hbar = 1$, $c = 1$ an electric field of
1 V/cm is equal to $2.4\cdot 10^{-4}\,\text{eV}^2$ while a magnetic field of 1 gauss is equal to $6.9\cdot 10^{-2}\,\text{eV}^2$.

If the velocity is the one induced by the electric field then the magnetic field causes a drift in
the positive $\hat x$ direction, regardless of the sign of q. This means that the charge current does depend
on the sign of q, which gave an early method of showing that the charge carriers in a conductor are
negatively charged.


[Sketch: the same L × W sample, now with the magnetic field B pointing out of the page.]


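In general the $\hat x$ velocity will build up until it cancels out the $\hat y$ component of the velocity, and
the velocity will oscillate between the x and y components. These oscillations will be centered
around the value for which the RHS of (92) is zero, namely $\frac{E}{B}c\,\hat x$.

We will argue more rigorously below that there should be a net drift in the $\hat x$ direction. Plugging
$\vec E = E\hat y$, $\vec B = B\hat z$ into (92) we obtain

$$m\dot v_x = \frac{qB}{c}\,v_y \tag{93a}$$

$$m\dot v_y = qE - \frac{qB}{c}\,v_x \tag{93b}$$

Using $\omega_L = qB/mc$ we can rewrite this in matrix form as

$$\frac{d}{dt}\begin{pmatrix} v_x\\ v_y\end{pmatrix} = \omega_L\begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix}\begin{pmatrix} v_x\\ v_y\end{pmatrix} + \begin{pmatrix} 0\\ qE/m\end{pmatrix}. \tag{94}$$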


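In general if A is an invertible matrix and we have a differential equation of the form

$$\frac{d}{dt}\vec v = A\vec v + \vec b$$

then we can rearrange as

$$\frac{d}{dt}\left(\vec v + A^{-1}\vec b\right) = A\left(\vec v + A^{-1}\vec b\right).$$

This has solution

$$\vec v(t) + A^{-1}\vec b = e^{At}\left(\vec v(0) + A^{-1}\vec b\right),$$

which we can rewrite as

$$\vec v(t) = \underbrace{e^{At}\left(\vec v(0) + A^{-1}\vec b\right)}_{\text{oscillation}}\;\underbrace{-\,A^{-1}\vec b}_{\text{drift}}. \tag{95}$$

Figure 6: Motion of a charged particle subject to crossed electric and magnetic fields.

In our case

$$A = \omega_L\begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix}, \quad e^{At} = \begin{pmatrix}\cos(\omega_L t) & \sin(\omega_L t)\\ -\sin(\omega_L t) & \cos(\omega_L t)\end{pmatrix}, \quad A^{-1} = \omega_L^{-1}\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}, \quad \vec b = \begin{pmatrix} 0\\ qE/m\end{pmatrix}. \tag{96}$$

We conclude that $\vec v$ is a sum of an oscillatory term (which averages to zero) and a drift term equal
to

$$-A^{-1}\vec b = \frac{Ec}{B}\,\hat x = v_H\,\hat x.$$

Here $v_H = \frac{Ec}{B}$ is the Hall velocity. (We see now why the $v \ll c$ and $E \ll B$ assumptions are
related.)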
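
The solution (95) with the matrices in (96) can be evaluated directly. The following sketch uses hypothetical parameter names (`omega_l` for $\omega_L$, `qe_over_m` for $qE/m$) and checks that the average over one period is the drift $qE/m\omega_L = Ec/B$ along $\hat x$.

```python
import math


def velocity(t, v0, omega_l, qe_over_m):
    """v(t) from (95): e^{At} (v(0) + A^{-1} b) - A^{-1} b."""
    # A^{-1} b with A = omega_l [[0, 1], [-1, 0]] and b = (0, qE/m).
    ainv_b = (-qe_over_m / omega_l, 0.0)
    wx = v0[0] + ainv_b[0]
    wy = v0[1] + ainv_b[1]
    c, s = math.cos(omega_l * t), math.sin(omega_l * t)
    # e^{At} is the rotation matrix [[c, s], [-s, c]].
    vx = c * wx + s * wy - ainv_b[0]
    vy = -s * wx + c * wy - ainv_b[1]
    return vx, vy


def mean_velocity(v0, omega_l, qe_over_m, samples=1000):
    """Average of v(t) over one full period 2*pi/omega_l."""
    period = 2 * math.pi / omega_l
    sx = sy = 0.0
    for i in range(samples):
        vx, vy = velocity(i * period / samples, v0, omega_l, qe_over_m)
        sx += vx
        sy += vy
    return sx / samples, sy / samples


if __name__ == "__main__":
    # Illustrative values; the drift should be qe_over_m / omega_l = 0.25.
    print(mean_velocity((0.3, -0.1), omega_l=2.0, qe_over_m=0.5))
```

The drift does not depend on the initial velocity, only on the ratio $qE/m\omega_L$, which is the content of the last step above.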
A third and slicker derivation is as follows. Let us change to a reference frame that is moving
at velocity $\vec v = v_H\hat x$. (Suppose for now that we do not know $v_H$.) To leading order in $v/c$ the $\vec E, \vec B$
fields transform as

$$\vec E' = \vec E + \frac{\vec v}{c}\times\vec B = \left(E - \frac{v_H}{c}B\right)\hat y \tag{97a}$$

$$\vec B' = \vec B - \frac{\vec v}{c}\times\vec E = \left(B - \frac{v_H}{c}E\right)\hat z \tag{97b}$$

To obtain $\vec E' = 0$ we should choose $v_H = \frac{E}{B}c$. This means that in the $x', y'$ frame we have pure
circular motion, and therefore in the original $x, y$ frame we have circular motion superimposed on
a drift with velocity $v_H\hat x$.

The resulting motion is depicted in Fig. 6. 

This reasoning predicts a current in the $\hat x$ direction known as the Hall current. The current
density in the $\hat x$ direction is

$$j_x = (e^-\ \text{density})(e^-\ \text{charge})(\text{velocity}) = nq\,\frac{E}{B}\,c = \sigma_H E_y,$$

where $\sigma_H = nqc/B$ is the Hall conductivity.


More generally we should think of conductivity as a matrix $\sigma$ with $\vec j = \sigma\vec E$. Thus $j_x = \sigma_H E_y$
but there may also be longitudinal conductivity which would yield a current $j_y = \sigma_L E_y$. Since there
is no drift in the y direction we have $\sigma_L = 0$. (In fact this conductivity is exponentially small.) We
can also show that $j_y = -\sigma_H E_x$. We then conclude that

$$\begin{pmatrix} j_x\\ j_y\end{pmatrix} = \begin{pmatrix} 0 & \sigma_H\\ -\sigma_H & 0\end{pmatrix}\begin{pmatrix} E_x\\ E_y\end{pmatrix}.$$

(This matrix perspective is also useful when computing resistivity which is $\rho = \sigma^{-1}$.)

[Plot: transverse resistance (arbitrary units) against $nhc/eB$.]

Figure 7: The dashed line is the classical prediction for the transverse resistance. We see instead
that it rises quickly around integer values of $\nu$ but is nearly flat (in fact, flat to within a factor of
$10^{-9}$) in between. The longitudinal resistance (and also conductivity) is mostly zero but jumps to
something nonzero around the integer values of $\nu$.

While the formula we have found is entirely classical, we can interpret it in terms of various
quantities from quantum mechanics. Recall that the filling fraction is

$$\nu = \frac{B_0}{B} = \frac{nhc}{eB} = \frac{\sigma_H}{e^2/h} = \frac{\sigma_H}{\sigma_0}.$$

Here we have used that $\sigma_0 = e^2/h$ is the fundamental unit of conductivity. We thus predict that

$$\sigma_H = \nu\sigma_0.$$


However, we find that the true picture is somewhat different. 


The quantum Hall effect. We now examine this problem using quantum mechanics. The
Hamiltonian is

$$H = \frac{1}{2m}\left(\vec p - \frac{q}{c}\vec A\right)^2 + q\phi. \tag{98}$$

Corresponding to our fields $\vec B = B\hat z$ and $\vec E = E\hat y$ we can choose $\vec A = -By\,\hat x$ and $\phi = -Ey$. We
also neglect motion in the $\hat z$ direction. This yields

$$H = \frac{1}{2m}\left(p_x + \frac{qBy}{c}\right)^2 + \frac{p_y^2}{2m} - qEy. \tag{99}$$


As with the dHvA effect we note that $[H, p_x] = 0$ and we restrict ourselves to the subspace of
wavefunctions of the form $e^{ik_x x}f(y)$. Using $\omega_L = qB/mc$ and $l_0 = \sqrt{\hbar/m\omega_L} = \sqrt{\hbar c/qB}$, we then
obtain

$$H_{k_x} = \frac{p_y^2}{2m} + \frac12 m\omega_L^2(y - y_0)^2 - qEy, \qquad y_0 = -l_0^2 k_x \tag{100}$$

$$\phantom{H_{k_x}} = \frac{p_y^2}{2m} + \frac12 m\omega_L^2\left(y^2 - 2yy_0 + y_0^2 - \frac{2qE}{m\omega_L^2}\,y\right) \tag{101}$$

$$\phantom{H_{k_x}} = \frac{p_y^2}{2m} + \frac12 m\omega_L^2(y - y_0 - y_1)^2 - \frac12 m\omega_L^2\left(2y_0y_1 + y_1^2\right), \qquad y_1 = \frac{qE}{m\omega_L^2} = \frac{Ec}{B\omega_L} = \frac{v_H}{\omega_L} \tag{102}$$

This looks again like a shifted harmonic oscillator, now centered at $Y = y_0 + y_1$. There is also an
additional energy shift, which can be simplified a bit. The first term is $-m\omega_L^2 y_0 y_1 = m\omega_L^2 l_0^2 k_x\frac{v_H}{\omega_L} = \hbar k_x v_H$. The second term $-\frac12 m\omega_L^2 y_1^2 = -\frac12 mv_H^2$ is simply the kinetic energy corresponding to the Hall velocity, albeit with a negative sign.
We conclude that the energies are

$$E_{k_x,n_y} = \hbar\omega_L\left(n_y + \frac12\right) + \hbar v_H k_x - \frac12 mv_H^2. \tag{103}$$


This first term labels the Landau levels and the last is simply an overall constant. The middle 
term, though, breaks the degeneracy in k,. Thus the energy levels are no longer degenerate. We 
can think of the Landau levels as being “tilted” as follows. 


[Sketch: Landau levels, flat for E = 0 (left) and tilted for E ≠ 0 (right).]


Note that earlier we broke the symmetry between x and y somewhat arbitrarily, and using a 
different gauge would’ve resulted in wavefunctions that were plane waves in the y direction, or even 
localized states. This freedom was due to the extensive degeneracy of the Hamiltonian when E=0. 
Now that the degeneracy is broken we can really say that the energy eigenstates are plane waves 
in the x direction. There is a semi-classical explanation for this. Different values of y correspond
to different values of the electric potential. Classically a particle in a potential experiencing a
magnetic field will follow equipotential lines (i.e. contours resulting from making a contour plot 
of the potential). The quantum eigenstates are then superpositions corresponding to these orbits. 




This picture can be useful when considering the situation with disorder. In this case the potential 
looks like a random landscape with many hills and valleys. Most orbits will be localized but (it 
turns out that) there will be one orbit going around the edges that wraps around the entire sample. 
We will, however, not discuss this more in 8.06. 

We can rewrite (103) in a more intuitive way in terms of the center of mass $Y = y_0 + y_1$. A
short calculation shows that

$$E_{k_x,n_y} = \hbar\omega_L\left(n_y + \frac12\right) - qEY + \frac12 mv_H^2. \tag{104}$$

Here we see the energy from the LL, the energy from a particle with center of mass Y in an electric
field and the kinetic energy corresponding to the Hall velocity.
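
The “short calculation” relating (103) and (104) can be spot-checked numerically. The sketch below uses $qE = m\omega_L v_H$ (which follows from $v_H = Ec/B$ and $\omega_L = qB/mc$) and also estimates $\frac{1}{\hbar}\partial E/\partial k_x$ by finite differences. The function names and default values are illustrative assumptions.

```python
import math


def energy_103(n, k, omega_l, v_h, hbar=1.0, m=1.0):
    """Energy in the form (103)."""
    return hbar * omega_l * (n + 0.5) + hbar * v_h * k - 0.5 * m * v_h ** 2


def energy_104(n, k, omega_l, v_h, hbar=1.0, m=1.0):
    """Energy in the form (104), written with the oscillator center Y."""
    y0 = -(hbar / (m * omega_l)) * k   # y_0 = -l_0^2 k_x
    y1 = v_h / omega_l                 # y_1 = v_H / omega_L
    q_e = m * omega_l * v_h            # qE, since v_H = Ec/B and omega_L = qB/mc
    return hbar * omega_l * (n + 0.5) - q_e * (y0 + y1) + 0.5 * m * v_h ** 2


def group_velocity(n, k, omega_l, v_h, hbar=1.0, m=1.0, dk=1e-4):
    """(1/hbar) dE/dk_x from (103), by central differences."""
    up = energy_103(n, k + dk, omega_l, v_h, hbar, m)
    down = energy_103(n, k - dk, omega_l, v_h, hbar, m)
    return (up - down) / (2 * dk * hbar)


if __name__ == "__main__":
    args = dict(n=2, k=0.7, omega_l=1.3, v_h=0.4, hbar=1.1, m=0.9)
    print(energy_103(**args), energy_104(**args), group_velocity(**args))
```

The two energies agree for any choice of parameters, and the finite-difference slope reproduces $v_H$ because (103) is linear in $k_x$.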


We can also rewrite Y in a more intuitive way. Write $Y = y_0 + y_1 = -\frac{p_x}{m\omega_L} + \frac{v_H}{\omega_L}$ and substitute

$$p_x = mv_x + \frac{q}{c}(-By) = mv_x - m\omega_L y$$

to obtain

$$Y = y - \frac{v_x - v_H}{\omega_L},$$

which is precisely the classical result we would expect from a particle undergoing oscillations with
frequency $\omega_L$ on top of a drift with velocity $v_H$.

We can now calculate the group velocity $v_g = \frac{1}{\hbar}\frac{\partial E}{\partial k_x}$ of the eigenstates of H. From (103) we
immediately obtain $v_g = v_H$, suggesting that each eigenstate moves with average velocity $v_H$.

Another way to measure the Hall current is to compute the probability current. On your pset
you will show that this is $\vec S = \mathrm{Re}\,\psi^*\vec v\,\psi$, where $\vec v = \frac{1}{m}\left(\vec p - \frac{q}{c}\vec A\right)$. The charge current is then $\vec j = q\vec S$.


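Finally the wavefunction is

$$\psi(x,y) = \frac{e^{ik_x x}}{\sqrt{W}}\,\phi_n(y - y_0 - y_1), \tag{105}$$

where $\phi_n$ is the $n^{\text{th}}$ eigenstate of the harmonic oscillator. For the average current density in the
sample we average over a vertical strip to obtain $\vec j_{\text{avg}} = \frac{1}{L}\int_0^L \vec j\,dy$.
As a sanity check we first evaluate

$$S_y = \mathrm{Re}\,\psi^*\frac{p_y}{m}\psi = \frac{\hbar}{m}\,\mathrm{Re}\left(-i\,\psi^*\partial_y\psi\right) = 0 \tag{106}$$

since $\phi_n(y - Y)$ is real. On the other hand

$$S_x = \mathrm{Re}\,\psi^*\left(\frac{p_x}{m} + \frac{qB}{mc}\,y\right)\psi \tag{107a}$$

$$\phantom{S_x} = \frac{1}{W}\,|\phi_n(y - y_0 - y_1)|^2\left(\frac{\hbar k_x}{m} + \omega_L y\right) \tag{107b}$$

$$\phantom{S_x} = \frac{1}{W}\,|\phi_n(y - y_0 - y_1)|^2\,\omega_L(y - y_0) \tag{107c}$$

To evaluate this last quantity observe that $|\phi_n(y - y_0 - y_1)|^2$ is an even function of $y - y_0 - y_1$ and
so we can write

$$\int_0^L S_x\,dy = \frac{1}{W}\int_0^L |\phi_n(y - y_0 - y_1)|^2\,\omega_L(y - y_0)\,dy \tag{108a}$$

$$\phantom{\int_0^L S_x\,dy} = \frac{\omega_L}{W}\int_0^L |\phi_n(y - y_0 - y_1)|^2\,(y - y_0)\,dy \tag{108b}$$

$$\phantom{\int_0^L S_x\,dy} = \frac{\omega_L}{W}\int_0^L \underbrace{|\phi_n(y - y_0 - y_1)|^2}_{\text{even}}\,\big(\underbrace{y - y_0 - y_1}_{\text{odd}} + y_1\big)\,dy \tag{108c}$$

$$\phantom{\int_0^L S_x\,dy} = \frac{\omega_L y_1}{W} = \frac{v_H}{W}. \tag{108d}$$

Each electron therefore contributes $\frac{1}{L}\int_0^L S_x\,dy = v_H/A$, and summing over N electrons with density $n = N/A$ we conclude that $j_x = qnv_H = qn\frac{E}{B}c$. Thus the Hall conductivity is (taking $|q| = e$ in the last two steps)

$$\sigma_H = \frac{j_x}{E_y} = \frac{nqc}{B} = \frac{e^2}{h}\,\frac{N\Phi_0}{\Phi} = \nu\sigma_0. \tag{109}$$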

This is the same as the classical result! 

To see the quantized conductivity we will need some additional arguments. While some of these 
are beyond the scope of 8.06, we will sketch some of these arguments briefly here. 

One argument that we can already make involves impurities/disorder. Thanks to impurities 
and disorder, many states are localized. We can think of the electron spectrum as containing not 
only Landau levels which are delocalized and can carry current but also many localized states which 
do not conduct. As we lower the B field the Landau levels move down and reduce their degeneracy. 
This has the same effect as raising the Fermi energy (see the Wikipedia page on “Quantum Hall
effect” for a nice animation). Sometimes the Fermi energy is between Landau levels and all we are 
doing is populating localized states, but when it sweeps through a LL then we rapidly fill a LL and 
the conductivity jumps up. 

This explains the plateaus in the conductivity but not why the conductivity should be an integer
multiple of $\sigma_0$. We will return to this point later after discussing the Aharonov-Bohm effect.


3.5 Aharonov-Bohm Effect 


To write down the Hamiltonian for a charged particle in electric and/or magnetic fields we seem to
need the vector and scalar potentials $\vec A, \phi$. But do these really have physical meaning? After all,
they are only defined up to the gauge transform

$$\vec A' = \vec A + \nabla f \tag{110a}$$

$$\phi' = \phi - \frac{1}{c}\frac{\partial f}{\partial t} \tag{110b}$$

On the other hand, $\vec E$ and $\vec B$ are gauge-invariant. Are these all we need? Can we build any other
gauge-invariant quantity from the $\vec E$ and $\vec B$ fields?

One attempt to construct a gauge-invariant quantity from the vector potential $\vec A$ is to consider
performing a line integral. Fix some path with endpoints $\vec x_1$ and $\vec x_2$ and define the line integral

$$P = \int_{\text{path}} \vec A\cdot d\vec l. \tag{111}$$

If we perform the transform in (110a) then we replace P with

$$P' = P + \int_{\text{path}} \nabla f\cdot d\vec l = P + f(\vec x_2) - f(\vec x_1). \tag{112}$$


[Figure panels (a) and (b): a solenoid crossing a plane, and a cross-section of that plane.]

Figure 8: (a) An infinite solenoid of radius R intersects a plane. (b) A cross-section of that plane.
The magnetic field inside the solenoid is $B\hat z$. An electron and the curve C are entirely outside the
solenoid.


This is gauge invariant iff $\vec x_1 = \vec x_2$. In this case the path becomes a loop and

$$P = \oint_{\text{loop}} \vec A\cdot d\vec l \;\overset{\text{Stokes' thm}}{=}\; \int_{\text{surface}} (\nabla\times\vec A)\cdot d\vec a = \int_{\text{surface}} \vec B\cdot d\vec a = \Phi. \tag{113}$$

In the last step we have defined $\Phi$ to be the magnetic flux through the surface. The last two
quantities are manifestly gauge invariant. However, they are not local! In particular, P depends
on $\vec A$ along the loop but the enclosed magnetic field might be in a very distant region.

This is the fundamental tradeoff that we get with gauge theories. If we write our Hamiltonian
in terms of gauge-covariant quantities like $\vec A, \phi$ then we have enormous redundancy thanks to the
gauge freedom. But if we try to describe our physics in terms of gauge-invariant quantities like
$\vec E, \vec B$ then we give up locality, which is much worse.


The Aharonov-Bohm thought experiment. In 1959 Aharonov and Bohm proposed the following
thought experiment in which an electron always stays within a region with $\vec B = 0$ but $\vec A \neq 0$
and this nonzero vector potential has an observable effect. (In fact, an equivalent experiment was
proposed by Ehrenberg and Siday in 1949, and was arguably implicit in Dirac’s 1931 arguments
about magnetic monopoles.)

The idea is to have current running through a solenoid going from $z = -\infty$ to $z = \infty$. The field
inside the solenoid is $B\hat z$ and the field outside the solenoid is 0. (If the solenoid is finite the field
outside would be small but nonzero.) An electron is confined to the z = 0 plane and always remains
outside the solenoid. In particular it only moves through a region where $\vec B = 0$.

However, the vector potential cannot be zero outside the solenoid. Indeed consider a curve C
enclosing the solenoid such as the dashed line in Fig. 8(b). The loop integral of $\vec A$ around C is

$$\oint_C \vec A\cdot d\vec l = \pi R^2 B \neq 0. \tag{114}$$
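
A small numerical illustration of (114): outside the solenoid one convenient choice of vector potential is $\vec A = \frac{\Phi}{2\pi r}\hat\phi$ (the form that appears later in (132)). The sketch below integrates it around circular loops; the loop geometry, step count and function names are illustrative assumptions.

```python
import math


def vector_potential(x, y, flux):
    """A = flux / (2 pi r) phi-hat outside the solenoid, in Cartesian components."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))


def loop_integral(cx, cy, radius, flux, steps=2000):
    """Midpoint-rule estimate of the loop integral of A around a circle.

    The circle is centered at (cx, cy); it must not pass through the origin,
    where the solenoid sits.
    """
    total = 0.0
    dtheta = 2 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        ax, ay = vector_potential(x, y, flux)
        # Line element dl = radius * (-sin, cos) dtheta.
        total += (-ax * math.sin(theta) + ay * math.cos(theta)) * radius * dtheta
    return total


if __name__ == "__main__":
    flux = 1.0
    print(loop_integral(0.0, 0.0, 2.0, flux))  # loop enclosing the solenoid
    print(loop_integral(5.0, 0.0, 2.0, flux))  # loop not enclosing it
```

Loops that enclose the solenoid return the full flux regardless of their size or center, while loops that do not enclose it return zero, even though $\vec B = 0$ everywhere along both.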

The two-slit experiment. First we recall the two-slit experiment first introduced in 8.04. A
source S emits mass-m particles with energy $E = \frac{\hbar^2k^2}{2m}$. They pass through a screen with two slits
(labeled A and B) which are separated by a distance a before hitting a screen a distance L away.
Suppose their position on the screen is y. (Let’s just consider one dimension.) There are two paths:
one from S to A to y and one from S to B to y. Denote their lengths by $L_A$ and $L_B$. If $L \gg a$
then a little trig will show that

$$L_B - L_A \approx \frac{ay}{L}. \tag{115}$$

The wavefunction at the screen will be the sum of the contributions from both paths. We write
this somewhat informally as

$$\psi(y) = \psi(S \to A \to y) + \psi(S \to B \to y). \tag{116}$$

Since a particle traveling a distance d picks up a phase of $e^{ikd}$, we find that

$$\psi(y) \propto e^{ikL_A} + e^{ikL_B}. \tag{117}$$

The probability of finding the particle at position y is then

$$|\psi(y)|^2 \propto \cos^2\left(\frac{k(L_B - L_A)}{2}\right) = \cos^2\left(\frac{kay}{2L}\right). \tag{118}$$

[Plot: probability density for the two-slit experiment.]


The two-slit experiment with non-zero vector potential. Suppose now that our particle
has charge q. Suppose further that the two paths of the two-slit experiment go around an infinite
solenoid, so that the particle experiences a nonzero vector potential but zero magnetic field.

We will need a prescription for solving the Schrödinger equation in a region where $\vec B = 0$ but
$\vec A$ may be nonzero. To do so, let us fix a point $\vec x_0$, which in our scenario will be the location of the
source S. Then given a curve C from $\vec x_0$ to $\vec x$, define

$$g(\vec x, C) = \frac{q}{\hbar c}\int_{\vec x_0,\ \text{along } C}^{\vec x} \vec A\cdot d\vec l. \tag{119}$$

We claim that under some conditions $g(\vec x, C)$ is independent of C and can be expressed as a function
only of $\vec x$. In other words

$$g(\vec x, C) = g(\vec x). \tag{120}$$

These conditions are that we restrict our curves C and our points $\vec x$ to a simply-connected
region of space in which $\vec B = 0$ everywhere. Here “simply connected” means that there is a path
between any two points and any loop can be continuously contracted to a point. We will see in a
minute where this requirement is used.

Figure 9: Two curves from $\vec x_0$ to $\vec x$ and the region enclosed by them.


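To prove (120), consider two curves $C_1, C_2$ with the same endpoint. Then

$$g(\vec x, C_1) - g(\vec x, C_2) = \frac{q}{\hbar c}\int_{\vec x_0,\ \text{along } C_1}^{\vec x} \vec A\cdot d\vec l - \frac{q}{\hbar c}\int_{\vec x_0,\ \text{along } C_2}^{\vec x} \vec A\cdot d\vec l \tag{121a}$$

$$\phantom{g(\vec x, C_1) - g(\vec x, C_2)} = \frac{q}{\hbar c}\int_{\vec x_0,\ \text{along } C_1}^{\vec x} \vec A\cdot d\vec l + \frac{q}{\hbar c}\int_{\vec x,\ \text{along } C_2}^{\vec x_0} \vec A\cdot d\vec l \tag{121b}$$

$$\phantom{g(\vec x, C_1) - g(\vec x, C_2)} = \frac{q}{\hbar c}\oint_{\text{along } C_1 \text{ then } C_2} \vec A\cdot d\vec l \tag{121c}$$

$$\phantom{g(\vec x, C_1) - g(\vec x, C_2)} = \frac{q}{\hbar c}\,\Phi \tag{121d}$$

In this last equation, $\Phi$ is the flux enclosed by the loop made up of $C_1$ and $C_2$; see Fig. 9. Because the region is simply connected, this loop bounds a surface lying entirely inside the region, where $\vec B = 0$, so $\Phi = 0$ and the two values of g agree.
Next we can use the function $g(\vec x)$ to construct the solution of the Schrödinger equation in a
simply connected region with $\vec B = 0$.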


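Claim 1. Let $\psi^{(0)}(\vec x, t)$ be a solution of the Schrödinger equation for the free Hamiltonian $H^{(0)} = \vec p^{\,2}/2m$ and suppose that
$\psi^{(0)}(\vec x, t)$ has support entirely in a simply connected region where $\vec B = 0$. Let $H = \left(\vec p - \frac{q}{c}\vec A\right)^2/2m$.
Then a solution of the Schrödinger equation $H\psi = i\hbar\partial_t\psi$ is

$$\psi(\vec x, t) = e^{ig(\vec x)}\,\psi^{(0)}(\vec x, t). \tag{122}$$

Note that the claim relies implicitly on the fact that we can write g solely as a function of $\vec x$.

Proof. Recall that for any $f(\vec x)$ we have $[\vec p, f] = -i\hbar\nabla f$. Then

$$[\vec p, e^{ig(\vec x)}] = -i\hbar\nabla e^{ig(\vec x)} \tag{123a}$$

$$\phantom{[\vec p, e^{ig(\vec x)}]} = \hbar\,e^{ig(\vec x)}\,\nabla g \tag{123b}$$

$$\phantom{[\vec p, e^{ig(\vec x)}]} = \hbar\,e^{ig(\vec x)}\,\frac{q}{\hbar c}\vec A(\vec x) \tag{123c}$$

$$\phantom{[\vec p, e^{ig(\vec x)}]} = \frac{q}{c}\,\vec A\,e^{ig(\vec x)} \tag{123d}$$

Thus

$$\left(\vec p - \frac{q}{c}\vec A\right)e^{ig} = e^{ig}\,\vec p. \tag{124}$$

This implies that

$$He^{ig} = \frac{1}{2m}\left(\vec p - \frac{q}{c}\vec A\right)^2 e^{ig} = e^{ig}\,\frac{\vec p^{\,2}}{2m} = e^{ig}H^{(0)}. \tag{125}$$

Thus we have (using also the fact that $\partial_t g = 0$) that

$$H\psi = He^{ig}\psi^{(0)} = e^{ig}H^{(0)}\psi^{(0)} = e^{ig}\,i\hbar\partial_t\psi^{(0)} = i\hbar\partial_t\psi. \tag{126}$$

Figure 10: A two-slit experiment is conducted with a solenoid between the two paths. We define
two simply connected regions A and B, both of which include the source and the screen where we
observe interference. Region A contains paths $S \to A \to y$ and region B contains paths $S \to B \to y$.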


We can now use Claim 1 to analyze the two-slit experiment in the presence of a magnetic field. 
Suppose that we add a solenoid in between the two paths, as depicted in Fig. 10. Then we can 
define regions A and B which (a) are both simply connected, (b) both contain the source S and 
the target y (for all values of y) and (c) region A contains point A and region B contains point B. 
This will allow us to use (122) separately in each region. 

Let $g_A$ and $g_B$ denote the functions g restricted to regions A and B respectively, and let
$\psi^{(0)}(S \to A \to y)$ and $\psi^{(0)}(S \to B \to y)$ be the solutions of the free Schrödinger equation in those
regions. This means that

$$g_A(y) = \frac{q}{\hbar c}\int_{S\to A\to y} \vec A\cdot d\vec l \quad\text{and}\quad g_B(y) = \frac{q}{\hbar c}\int_{S\to B\to y} \vec A\cdot d\vec l. \tag{127}$$

Then from (122) we have

$$\psi(S \to A \to y) = \exp(ig_A(y))\,\psi^{(0)}(S \to A \to y) \tag{128a}$$

$$\psi(S \to B \to y) = \exp(ig_B(y))\,\psi^{(0)}(S \to B \to y) \tag{128b}$$

Again the total amplitude will have an interference term which is now shifted by $g_A - g_B$. Indeed

$$|\psi(y)|^2 \propto \left|e^{ikL_A + ig_A} + e^{ikL_B + ig_B}\right|^2 \propto \cos^2\left(\frac12\left(\frac{kay}{L} + (g_A - g_B)\right)\right). \tag{129}$$

This new phase shift is

$$g_A(y) - g_B(y) = \frac{q}{\hbar c}\oint_{S\to A\to y\to B\to S} \vec A\cdot d\vec l = \frac{q}{\hbar c}\,\Phi, \tag{130}$$

where $\Phi = \pi R^2 B$ is the enclosed flux. If $q = -e$ then $\frac{q}{\hbar c} = -\frac{2\pi}{\Phi_0}$ and the phase shift is $-2\pi\frac{\Phi}{\Phi_0}$.
The resulting interference pattern is then

$$|\psi(y)|^2 \propto \cos^2\left(\frac12\left(\frac{kay}{L} - 2\pi\frac{\Phi}{\Phi_0}\right)\right). \tag{131}$$

[Plot: probability density for the two-slit experiment with flux $\Phi$.]

Figure 11: Observed interference pattern when performing the two-slit experiment with a magnetic
flux $\Phi$ enclosed between the two paths.


The resulting interference pattern is depicted in Fig. 11. Observe that the experiment is only
sensitive to the fractional part of $\Phi/\Phi_0$. In that sense we find again a periodic dependence on the
magnetic field.
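
The periodicity in $\Phi/\Phi_0$ can be seen directly by evaluating the pattern (131). In this sketch the function name, the argument `flux_ratio` (standing for $\Phi/\Phi_0$) and the sample values of k, a and L are illustrative assumptions; the overall normalization is dropped.

```python
import math


def intensity(y, k, a, screen_distance, flux_ratio):
    """Unnormalized |psi(y)|^2 from (131), with flux_ratio = Phi / Phi_0."""
    phase = k * a * y / screen_distance - 2 * math.pi * flux_ratio
    return math.cos(0.5 * phase) ** 2


if __name__ == "__main__":
    k, a, dist = 10.0, 0.2, 5.0  # illustrative values
    for ratio in (0.0, 0.25, 0.5, 1.0, 1.25):
        print(ratio, [round(intensity(y, k, a, dist, ratio), 3) for y in (0.0, 1.0, 2.0)])
```

Adding any integer to `flux_ratio` leaves the pattern unchanged, while a half-integer value swaps the bright and dark fringes (for example, the central maximum at y = 0 becomes a zero).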

On pset 10 you will explore a related phenomenon, this time for the energy levels of a time- 
independent Hamiltonian. A particle confined to a ring with a magnetic flux through the ring will 
have its energy eigenstates shifted by an amount that again depends only on the fractional part
of $\Phi/\Phi_0$. (This problem in turn is related to problem 2 on pset 5 which was a time-independent
version of the Berry phase.) 

In both cases, if $\Phi$ is an integer multiple of $\Phi_0$ then this cannot be distinguished from there being no
magnetic field. This is related to a deep property of electromagnetism, which is rather far beyond
the scope of 8.06. While we have presented $\vec A$ and $\phi$ as real numbers, they can also be thought
of as elements of the Lie algebra u(1) which generates the Lie group U(1). Electromagnetism is
known as U(1) gauge theory for this reason. Non-abelian gauge theories are also possible. Indeed
the Standard Model is a U(1) x SU(2) x SU(3) gauge theory, with the U(1) part corresponding to
electromagnetism, the SU(2) part to the W and Z bosons in the weak force and the SU(3) part to
gluons in the strong force.


3.5.1 The IQHE revisited 


Finally we review an argument by Laughlin and Halperin for explaining charge quantization in the 
IQHE. 

First we need to review a concept known as the Thouless charge pump. Suppose we have a 
1-d system containing some number of electrons. For $0 \le t \le T$ suppose we adiabatically change
the Hamiltonian from H(0) to H(T), and suppose further that H(T) = H(0). In other words we 
return to the Hamiltonian that we start with. Then the net flux of electrons from one end of the 
system to the other will be an integer. This is because, other than the endpoints, the state of the 
system should remain the same. We will see below how this applies to the IQHE. 

In Section 3.4 we considered electrons on a sheet. Instead we will use an annulus geometry as 
depicted in Fig. 12. 


Figure 12: Hall effect on an annulus. There is still a B field in the $\hat z$ direction, but now the electric
field points in the $\hat\phi$ direction. We can think of it as being induced by a time-dependent flux in the
center of the annulus. The Hall current now flows radially outward and can be measured by the
amount of charge flowing across the contour C.

Suppose that the Hall conductivity is $\sigma_H = x\sigma_0$ for some unknown x. We would like to show
that x is an integer.
To induce an electric field in the $\hat\phi$ direction we will apply a time-dependent flux in the center
of the annulus. If this flux is $\Phi$ then the vector potential at radius r is

$$\vec A = \frac{\Phi}{2\pi r}\,\hat\phi. \tag{132}$$

The electric field is then

$$\vec E = -\frac{1}{c}\frac{\partial\vec A}{\partial t} = -\frac{1}{2\pi rc}\frac{\partial\Phi}{\partial t}\,\hat\phi. \tag{133}$$

The Hall current is then

$$j_r = \sigma_H E_\phi = -\frac{\sigma_H}{2\pi rc}\frac{\partial\Phi}{\partial t}. \tag{134}$$

Now suppose we adiabatically increase $\Phi$ from 0 to $\Phi_0$. By the results of the pset, the final
Hamiltonian is the same as the initial Hamiltonian (up to a physically unobservable gauge transform),
and by the Thouless charge pump argument, this means that an integer number of electrons
must have flowed from the inner loop to the outer loop. (Note this integer could be zero or negative.)

Let $\Delta Q$ be the amount of charge transferred in this way. Then

$$\Delta Q = \int_0^T \frac{dQ}{dt}\,dt = \int_0^T 2\pi r\,j_r\,dt = -\frac{\sigma_H}{c}\,\Phi_0 = -x\,\frac{e^2}{h}\,\frac{hc}{e}\,\frac{1}{c} = -xe. \tag{135}$$

Since $\Delta Q$ must be an integer multiple of e, we conclude that x is an integer, as desired.


8.06 Spring 2015 Lecture Notes 
5. Quantum Computing 


Aram Harrow 


Last updated: May 23, 2016 


Quantum computing uses familiar principles of quantum mechanics, but with a different phi- 
losophy. The key differences are 


e It looks at the information carried by quantum systems, and methods of manipulating it, 
in ways that are independent of the underlying physical realization (i.e. spins, photons, 
superconductors, etc.) 


e Instead of attempting to describe the world around us, it asks what we can create. As a result, 
features of quantum mechanics that appear to be limitations, such as Heisenberg’s uncertainty 
principle, can be turned into new capabilities, such as secure quantum key distribution. 


The approach is of course analogous to classical computing, where the principles of computing 
are the same whether your information is stored in magnetic spins (e.g. a hard drive) or electric 
circuits (e.g. RAM). After the modern notion of a computer was invented in the 1930s by Alan 
Turing, Alonzo Church, and others, many believed that all computing models (ranging from modern 
CPUs to DNA computers to clerks working with pencil and paper) were fundamentally equivalent. 
Specifically, the Strong Church-Turing thesis held that any reasonable computing model could 
simulate any other with modest overhead. Relativity turns out to still be compatible with this 
hypothesis, but as we will see, quantum mechanics is not. 


1 Classical computing 


Model a classical computer by a collection of m bits $x_1, \ldots, x_m$ upon which we perform gates, each
acting on O(1) bits. For example, the NOT gate acts on a single bit, maps 0 to 1, and 1 to 0. The
NAND gate, meaning “not-AND”, is defined by


x y x NAND y 
0 0 1 
0 1 1 
1 0 1 
1 1 0 


The NAND gate is useful because it is computationally universal. Using only NAND gates, we can 
construct any function on n-bit strings. 
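
As a small illustration of universality, the sketch below builds NOT, AND, OR and XOR out of NAND alone and checks them against their truth tables. These are standard constructions; the function names are illustrative.

```python
from itertools import product


def nand(x, y):
    """NAND on bits 0/1, matching the table above."""
    return 1 - (x & y)


def not_gate(x):
    return nand(x, x)


def and_gate(x, y):
    return not_gate(nand(x, y))


def or_gate(x, y):
    return nand(not_gate(x), not_gate(y))


def xor_gate(x, y):
    t = nand(x, y)
    return nand(nand(x, t), nand(y, t))


if __name__ == "__main__":
    for x, y in product((0, 1), repeat=2):
        print(x, y, nand(x, y), and_gate(x, y), or_gate(x, y), xor_gate(x, y))
```

Since any Boolean function can be written using AND, OR and NOT, having these three gates is enough to build any function on n-bit strings.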

Here, we can imagine the output of a gate overwriting an existing bit, but it is often simpler 
to work with reversible computers. The NOT gate is reversible. Another example of a reversible 
gate is the controlled-NOT or CNOT, which maps a pair of bits $(x, y)$ to $(x, x \oplus y)$. Here $\oplus$ denotes
addition modulo 2.

x    y    x⊕y
0    0    0
0    1    1
1    0    1
1    1    0


For reversible circuits, computational universality means the ability to implement any invertible 
function from {0,1}" (the set of n-bit strings) to {0,1}". It turns out that NOT and CNOT gates 
alone are not computationally universal (proving this is a nice exercise; as a hint, they generate 
only transformations that are linear mod 2). For computational universality in reversible circuits, 
we need some 3-bit gate. One convenient one is the Toffoli gate (named after its inventor, Tom 
Toffoli), which maps $(x, y, z)$ to $(x, y, z \oplus xy)$. If we start with $(x, y, 1)$, then applying the Toffoli gate
yields $(x, y, x \text{ NAND } y)$. Since the Toffoli gate can simulate the NAND gate, it is computationally
universal. 
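
The following sketch checks the two properties used above: the Toffoli gate is reversible (it is its own inverse, hence a permutation of the 8 three-bit inputs), and with the third bit set to 1 its last output is x NAND y. The function names are illustrative.

```python
from itertools import product


def cnot(x, y):
    """Controlled-NOT: (x, y) -> (x, x XOR y)."""
    return x, x ^ y


def toffoli(x, y, z):
    """Toffoli gate: (x, y, z) -> (x, y, z XOR (x AND y))."""
    return x, y, z ^ (x & y)


if __name__ == "__main__":
    for x, y in product((0, 1), repeat=2):
        # Third output is x NAND y when the target bit starts at 1.
        print(x, y, toffoli(x, y, 1)[2])
```

Applying either gate twice returns the original input, which is the simplest way to see that no information is lost.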

There are not significant differences between the reversible and irreversible models, but the 
reversible one will be a more convenient way to discuss quantum computing. 


1.1 Complexity 


The complexity of a function is the minimum number of gates required to implement it. Since the 
exact number of gates is not very illuminating, and will in general depend on the details of the 
computational model, we often care instead about how the complexity scales with the input size 
n. For example, multiplying two n-bit numbers using the straightforward method requires
time $O(n^2)$, since this is the number of pairs of bits that have to be considered. The true complexity
is lower, since a more clever algorithm is known to multiply numbers in time $O(n\log(n))$. Factoring
integers, on the other hand, is not something we know how to do quickly. The naive algorithm
to factor an n-bit number x requires checking all primes $\le \sqrt{x}$, which requires checking $\approx 2^{n/2}/n$
numbers. A much more complicated algorithm is known (the number field sieve) which requires
time roughly $O(2^{n^{1/3}})$, and modern cryptosystems are based on the assumption that nothing substantially
faster is possible. 

In general, we say that a problem can be solved efficiently if we can solve it in time polynomial
in the input size n, i.e. in time $\leq n^c$ for some constant $c > 0$. So multiplication can be solved
efficiently, as well as many other problems, like inverting a matrix, minimizing a convex function, 
finding the shortest path in a graph, etc. On the other hand, many problems besides factoring are 
also not known to have any fast algorithms; one example is the “taxicab-ripoff problem”,
namely finding the longest path in a graph without repeating any vertex. 

The strong Church-Turing thesis essentially states that modern computers (optionally equipped with
true random number generators) can simulate any other model of computation with polynomial 
overhead. In other words “polynomial-time” means the same on any platform. 


1.2 The complexity of quantum mechanics 


Feynman observed in 1982 that quantum mechanics appeared to violate the strong Church-Turing 
thesis. Consider the state of n spin-1/2 particles. This state is a unit vector in $(\mathbb{C}^2)^{\otimes n} = \mathbb{C}^{2^n}$,
so to describe it requires $2^n$ complex numbers. Similarly solving the Schrödinger equation on this
system appears to require time exponential in n on a classical computer. 


Already this suggests that the classical model of computing does not capture everything in the 
world. But Feynman then asked whether in this case a hypothetical quantum computer might be
able to do better. On pset 10, you will have the option of exploring a technique by which quantum 
computers can simulate general quantum systems with overhead that is polynomial in the space 
and time of the region being simulated. 


2 Quantum computers 


2.1 Qubits and gates 


A quantum computer can be thought of as n spin-1/2 particles whose Hamiltonian is under our 
control. A single spin-1/2 particle is called a qubit and we write its state as $a_0|0\rangle + a_1|1\rangle$ (think of
$|0\rangle = |\uparrow\rangle$ and $|1\rangle = |\downarrow\rangle$ if you like, but $|0\rangle = |\text{ground}\rangle$, $|1\rangle = |\text{first excited}\rangle$ works equally well).
The state of two qubits can be written as 


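$$a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle,$$

where we use $|x_1 x_2\rangle$ as a shorthand for $|x_1\rangle \otimes |x_2\rangle$. And the state of n qubits can be written as

$$|\psi\rangle = \sum_{\vec{x} \in \{0,1\}^n} a_{\vec{x}} |\vec{x}\rangle.$$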


The power of quantum computing comes from the fact that acting on one or two qubits is 
equivalent to applying a $2^n \times 2^n$ matrix to the state $|\psi\rangle$. Let’s see concretely how this works.

If we have a single qubit and can apply any H we like for any time t, we can generate any 
unitary matrix $U = e^{-iHt}$. (Use units where $\hbar = 1$.) This is because any unitary matrix $U$ can be
diagonalized as $U = \sum_j e^{i\theta_j} |v_j\rangle\langle v_j|$, and so applying the Hamiltonian $\sum_j -\theta_j |v_j\rangle\langle v_j|$ for time 1
will yield $U$.

What if we have two qubits? If we apply $H$ to the first qubit then this is equivalent to the
two-qubit Hamiltonian $H \otimes I_2$. The resulting unitary is


$$V = e^{-i(H \otimes I)t} = e^{-iHt} \otimes I = U \otimes I.$$


Similarly if we apply $H$ to the second qubit, then this corresponds to the two-qubit Hamiltonian
$I \otimes H$, which generates the unitary operation $I \otimes U$. If we instead apply a two-qubit Hamiltonian,
then we can generate any two-qubit (i.e. 4 x 4) unitary. 

This does not scale up to general n-qubit unitaries. Instead we perform these by stringing 
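together sequences of one and two-qubit gates. Some important one-qubit gates (each of which is
both Hermitian and unitary, meaning their eigenvalues are $1, -1$) are

$$X = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad Y = \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad Z = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{\sqrt{2}} \sum_{x,y \in \{0,1\}} (-1)^{xy} |x\rangle\langle y|.$$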


The last gate H is known as the Hadamard transform and plays an important role in quantum 
computing. It has the useful properties that $HXH = Z$ and $HZH = X$.
An important two-qubit gate is the controlled-NOT gate, which is discussed further on pset 10.
How do these look when acting on n qubits? If we apply a one-qubit ($2 \times 2$) gate $U$ to the $j$th
qubit, this results in the unitary

$$I^{\otimes (j-1)} \otimes U \otimes I^{\otimes (n-j)}.$$

If we apply a two-qubit gate $V$ to qubits $j, j+1$, then we have the unitary

$$I^{\otimes (j-1)} \otimes V \otimes I^{\otimes (n-j-1)}.$$

There is no reason two-qubit gates cannot be applied to non-consecutive qubits, but it requires
more notation, so we will avoid this.
One fact we will use later is that 


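$$H^{\otimes n} = \left( \frac{1}{\sqrt{2}} \sum_{x_1,y_1 \in \{0,1\}} (-1)^{x_1 y_1} |x_1\rangle\langle y_1| \right) \otimes \cdots \otimes \left( \frac{1}{\sqrt{2}} \sum_{x_n,y_n \in \{0,1\}} (-1)^{x_n y_n} |x_n\rangle\langle y_n| \right)$$

$$= \frac{1}{\sqrt{2^n}} \sum_{\vec{x},\vec{y} \in \{0,1\}^n} (-1)^{x_1 y_1 + \cdots + x_n y_n} |\vec{x}\rangle\langle\vec{y}|$$

$$= \frac{1}{\sqrt{2^n}} \sum_{\vec{x},\vec{y} \in \{0,1\}^n} (-1)^{\vec{x}\cdot\vec{y}} |\vec{x}\rangle\langle\vec{y}|.$$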
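
The closed form for the n-fold tensor power of the Hadamard gate, and the identity $HXH = Z$, can be checked numerically for small n. The sketch below uses plain Python lists as matrices; the helper names (`kron`, `matmul`, `hadamard_power`, `dot_mod2`) are hypothetical and chosen only for this illustration.

```python
import math

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[ai * bj for ai in row_a for bj in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]      # Hadamard
X = [[0, 1], [1, 0]]       # Pauli X
Z = [[1, 0], [0, -1]]      # Pauli Z

def hadamard_power(n):
    """Build the 2^n x 2^n matrix of H tensored with itself n times."""
    result = [[1.0]]
    for _ in range(n):
        result = kron(result, H)
    return result

def dot_mod2(x, y):
    """Inner product mod 2 of two integers read as bit strings."""
    return bin(x & y).count("1") % 2
```

With these definitions, entry $(x, y)$ of `hadamard_power(n)` should equal $(-1)^{\vec{x}\cdot\vec{y}}/\sqrt{2^n}$, where the integers x and y are read as n-bit strings.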


2.2 Quantum computing is at least as strong as classical computing 


I claim that one and two-qubit gates can implement the quantum Toffoli gate. There is a construc- 
tion with 6 CNOT gates and many single-qubit gates which I will not describe here. This means 
that given the ability to do arbitrary one- and two-qubit gates, a quantum computer can perform 
any classical reversible computation with asymptotically constant overhead. 

This is good to know, but it is in going beyond classical computing that quantum comput- 
ers become interesting. On the other hand, we may often want to use classical algorithms as a 
subroutine. 

If $f : \{0,1\}^n \to \{0,1\}^n$ is a reversible (i.e. invertible) function, then the corresponding unitary is

$$U_f = \sum_{\vec{x} \in \{0,1\}^n} |f(\vec{x})\rangle\langle\vec{x}|.$$


3  Grover’s algorithm 


Grover’s algorithm shows a quantum speedup for the simple problem of unstructured search. Suppose
that $f$ is a function from $\{0, \ldots, N-1\}$ to $\{0,1\}$ and our goal is to find some $x$ for which
f(x) = 1. How many times do we need to evaluate f to find such an x? For simplicity, assume 
that there is a single w for which f(w) = 1. (w stands for “winner.”) Evaluating f(x) for random 
x will find w in an expected N/2 steps. 

Remarkably, quantum computers can improve this by a quadratic factor, and can find w using 
$O(\sqrt{N})$ steps.


The algorithm has two ingredients. First, we need to be able to implement the unitary 


$$A = \sum_{0 \leq x < N} (-1)^{f(x)} |x\rangle\langle x| = I - 2|w\rangle\langle w|.$$


An explanation of how to do this is on pset 10. The unitary A can be thought of as a reflection 
about the state $|w\rangle$.
Next, we will need to perform the reflection


$$B = 2|s\rangle\langle s| - I,$$


where $|s\rangle = \frac{1}{\sqrt{N}} \sum_{0 \leq x < N} |x\rangle$ is the uniform superposition state. This is closely related to the
problem of preparing $|s\rangle$. If $N = 2^n$, then $H^{\otimes n}|0^n\rangle = |s\rangle$. (Here $0^n$ is the string of n zeroes.)
Similarly $B = H^{\otimes n}(2|0^n\rangle\langle 0^n| - I)H^{\otimes n}$. Thus the problem of implementing $B$ reduces to that
of implementing $I - 2|0^n\rangle\langle 0^n|$. This, in turn, reduces (again using pset 10) to the problem of detecting
when a string is equal to the all-zeroes string, which is an easy classical computation.

Finally we can describe Grover’s algorithm. It is 


• Start with $|s\rangle$.
• Apply $AB$ $T$ times, where $T$ will be determined later.
• Measure.


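The key to analyzing Grover’s algorithm is that we never leave the two-dimensional subspace
spanned by $\{|w\rangle, |s\rangle\}$. These vectors are linearly independent, but not orthogonal. To construct
an orthonormal basis from them, define

$$|v\rangle = \frac{1}{\sqrt{N-1}} \sum_{x \neq w} |x\rangle,$$

so that $|s\rangle = \sqrt{\frac{N-1}{N}}\,|v\rangle + \frac{1}{\sqrt{N}}\,|w\rangle$.
In the $\{|v\rangle, |w\rangle\}$ basis we have

$$A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

$$B = 2|s\rangle\langle s| - I = 2\begin{pmatrix} \frac{N-1}{N} & \frac{\sqrt{N-1}}{N} \\ \frac{\sqrt{N-1}}{N} & \frac{1}{N} \end{pmatrix} - I = \begin{pmatrix} 1 - \frac{2}{N} & \frac{2\sqrt{N-1}}{N} \\ \frac{2\sqrt{N-1}}{N} & \frac{2}{N} - 1 \end{pmatrix},$$

$$AB = \begin{pmatrix} 1 - \frac{2}{N} & \frac{2\sqrt{N-1}}{N} \\ -\frac{2\sqrt{N-1}}{N} & 1 - \frac{2}{N} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

If N is large, then $\theta = \sin^{-1}(2\sqrt{N-1}/N) \approx 2/\sqrt{N}$. Thus, after $T = \frac{\pi}{2\theta} \approx \frac{\pi}{4}\sqrt{N}$ steps, we have
rotated from $|s\rangle \approx |v\rangle$ to $\approx |w\rangle$ (up to an overall sign), and measuring will have a high probability of returning the correct
answer.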

An alternate explanation of this rotation by $O(1/\sqrt{N})$ comes from the fact that we have reflected
across two lines which are at an angle of $O(1/\sqrt{N})$ with each other. See the blackboard for a
diagram. 
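
To see the rotation picture at work, the following is a minimal classical simulation of Grover’s algorithm on the full N-dimensional state vector. It is only a sketch: the function name `grover` and its arguments are my own, the state is stored as a list of real amplitudes, and the two reflections are applied in the order $AB$ used above (first $B$, then $A$).

```python
import math

def grover(N, w, T=None):
    """Simulate Grover's algorithm on N items with one marked item w.

    Returns the probability of measuring w after T applications of AB.
    If T is not given, use T = round(pi / (2 * theta)).
    """
    if T is None:
        theta = math.asin(2 * math.sqrt(N - 1) / N)
        T = round(math.pi / (2 * theta))
    state = [1 / math.sqrt(N)] * N           # uniform superposition |s>
    for _ in range(T):
        # B = 2|s><s| - I: reflect every amplitude about the mean amplitude.
        mean = sum(state) / N
        state = [2 * mean - amp for amp in state]
        # A = I - 2|w><w|: flip the sign of the marked amplitude.
        state[w] = -state[w]
    return state[w] ** 2

if __name__ == "__main__":
    # With N = 1024 only about (pi/4) * 32 = 25 iterations are needed.
    print(grover(1024, w=3))
```

Running too many iterations rotates past $|w\rangle$ and lowers the success probability again, which is why $T$ has to be chosen with care.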


4 Simon’s algorithm 


Grover’s algorithm reduced $O(N)$ time to $O(\sqrt{N})$ for a very general problem. Shor’s algorithm
offers a more dramatic speedup for a more specialized problem: finding the prime factors of a large
integer. As described above, the best known classical algorithms for factoring n-bit numbers require
time slightly larger than $2^{n^{1/3}}$. By contrast, a quantum computer could factor an n-bit number
using Shor’s algorithm in time O(n?). 

The key ingredient of Shor’s algorithm is period-finding, meaning the problem of finding $a$ given
the ability to evaluate a function $f$ for which $f(x) = f(x + a)$ for all $x$. This is straightforward,
but takes time, so first I'll explain Simon’s algorithm, which was one of the key pieces of inspiration for
Shor’s algorithm.

Let $f$ be a function on $\{0,1\}^n$ with the promise that, for $\vec{x} \neq \vec{y}$, $f(\vec{x}) = f(\vec{y})$ if and only if $\vec{x} \oplus \vec{y} = \vec{a}$ for
some secret $\vec{a} \in \{0,1\}^n$. The goal of the problem is to find $\vec{a}$.

Define $\mathbb{Z}_2 = \{0,1\}$ with $+$ meaning $\oplus$. We can check that $+, \cdot$ satisfy all the usual properties
of addition and multiplication, and so $\mathbb{Z}_2^n = \{0,1\}^n$ is a vector space. We will see later why this is
a useful move.

Consider the following algorithm. 


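• Start with the uniform superposition

$$H^{\otimes n}|0^n\rangle = \frac{1}{\sqrt{2^n}} \sum_{\vec{x} \in \mathbb{Z}_2^n} |\vec{x}\rangle.$$

• Compute $f$ and store the answer in a second register.

$$\frac{1}{\sqrt{2^n}} \sum_{\vec{x} \in \mathbb{Z}_2^n} |\vec{x}\rangle \otimes |f(\vec{x})\rangle.$$

• Measure the second register. Suppose the answer is $\alpha$. Then we are left with a superposition
over all $\vec{x}$ such that $f(\vec{x}) = \alpha$. By assumption, this state must have the form

$$\frac{|\vec{x}\rangle + |\vec{x} + \vec{a}\rangle}{\sqrt{2}}.$$

• Apply $H^{\otimes n}$ to obtain

$$\frac{1}{\sqrt{2^{n+1}}} \sum_{\vec{y}} \left( (-1)^{\vec{x}\cdot\vec{y}} + (-1)^{(\vec{x}+\vec{a})\cdot\vec{y}} \right) |\vec{y}\rangle.$$

• Measure, and obtain outcome $\vec{y}$ with probability

$$\Pr[\vec{y}] = \frac{1}{2^{n+1}} \left( 1 + (-1)^{\vec{a}\cdot\vec{y}} \right)^2 = \begin{cases} 2^{-(n-1)} & \text{if } \vec{a}\cdot\vec{y} = 0 \\ 0 & \text{if } \vec{a}\cdot\vec{y} = 1. \end{cases}$$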

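Thus, this procedure gives us a random vector $\vec{y}$ that is orthogonal to $\vec{a}$. After we have collected
$n - 1$ linearly independent such $\vec{y}$ we can determine $\vec{a}$ using linear algebra on $\mathbb{Z}_2^n$. After we have
collected $k \leq n-2$ linearly independent vectors $\vec{y}^{(1)}, \ldots, \vec{y}^{(k)}$, the probability that a new $\vec{y}$ is in the span of $\vec{y}^{(1)}, \ldots, \vec{y}^{(k)}$
is $2^{k-(n-1)} \leq 1/2$, since $\vec{y}$ is uniform over the $2^{n-1}$ vectors orthogonal to $\vec{a}$. Thus we need to run this entire procedure $O(n)$ times in order to find $\vec{a}$ with high
probability.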

By contrast, a classical algorithm would require $\gtrsim 2^{n/2}$ evaluations of $f$ in order to have a
non-negligible chance of finding $\vec{a}$.
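
The outcome distribution of the final measurement in Simon’s algorithm can be checked directly for small n. The sketch below is purely classical bookkeeping, not a quantum implementation: bit strings are stored as Python integers, and the names `simon_outcome_probabilities` and `recover_secret` are hypothetical. The recovery step simply tries every candidate, which is fine for a toy example but is not the linear-algebra method one would use in practice.

```python
import math

def dot_mod2(a, y):
    """Inner product mod 2 of two integers read as bit strings."""
    return bin(a & y).count("1") % 2

def simon_outcome_probabilities(n, a, x):
    """Pr[y] after applying the n-qubit Hadamard to (|x> + |x XOR a>)/sqrt(2)."""
    probs = []
    for y in range(2 ** n):
        amplitude = ((-1) ** dot_mod2(x, y)
                     + (-1) ** dot_mod2(x ^ a, y)) / math.sqrt(2 ** (n + 1))
        probs.append(amplitude ** 2)
    return probs

def recover_secret(n, ys):
    """All nonzero candidates that are orthogonal (mod 2) to every sample in ys."""
    return [c for c in range(1, 2 ** n)
            if all(dot_mod2(c, y) == 0 for y in ys)]
```

For example, with n = 3 and secret a = 0b101, only the outcomes y with $\vec{a}\cdot\vec{y} = 0$ have nonzero probability, and feeding those outcomes to `recover_secret` leaves a single candidate.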


5 Shor’s algorithm 


Suppose we'd like to factor a large number $N$, say 2048 bits long. One approach is to try dividing
by all primes up to $\sqrt{N}$. This takes a long time; on the order of $2^{1024}$ for our example. The best
known classical algorithm is the General Number Field Sieve, which runs in time
$\exp(\log(N)^{1/3} \cdot \mathrm{poly}(\log\log(N)))$, and has not yet been used on numbers larger than 768 bits. By contrast, Shor’s
factoring algorithm runs in time O(log?(N)). 

There are three main components to Shor’s algorithm. 


1. First factoring is reduced to “period-finding” which means finding an unknown period of a 
black-box function. This step uses number theory and not quantum mechanics. 


2. A quantum algorithm for period-finding is given. It is similar to Simon’s algorithm but instead 
of the Hadamard it uses a transform called the Quantum Fourier Transform (QFT). 


3. Finally we describe an efficient circuit for the QFT. 


We begin with the number theory part. 


Euclid’s Algorithm. Given two integers y and z you can quickly find the greatest common 
divisor gcd(y, z). 

Given $N$ choose a random $1 < a < N$ and assume $\gcd(a, N) = 1$. (If not, we have already found
a nontrivial factor!) Define $\mathrm{ord}(a)$ to be the least positive integer $r$ such that $a^r \bmod N = 1$. Is
there always such an $r$? Yes, consider the sequence $a, a^2 \bmod N, a^3 \bmod N, \ldots$ This sequence must
eventually repeat. If $a^x = a^y \bmod N$ and $y > x$ then $a^{y-x} = 1 \bmod N$. (This is not obvious but
can be proved using tools like the Chinese Remainder Theorem.)

Now suppose we have found $r = \mathrm{ord}(a)$. This means that $a^r - 1 = mN$ for some integer $m$.
Suppose that

• $r$ is even, and
• $a^{r/2} + 1$ is not a multiple of $N$.

(If either of these fails, try another $a$. It turns out there is a reasonable probability that both of
these hold.) Suppose even more concretely that $N = p_1^{e_1} \cdots p_k^{e_k}$. Then

$$(a^{r/2} - 1)(a^{r/2} + 1) = m\, p_1^{e_1} \cdots p_k^{e_k} \qquad (1)$$

and since $a^{r/2} + 1$ is not a multiple of $N$ there must be some $p_i$ that divides $a^{r/2} - 1$. We can now
compute $\gcd(a^{r/2} - 1, N)$ using Euclid’s algorithm and we will find a nontrivial factor of $N$.
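
The classical reduction from factoring to order-finding can be tried out on small numbers. In the sketch below the order is found by brute force, which takes exponential time in general and merely stands in for the quantum period-finding step; the function names `order` and `factor_from_order` are my own.

```python
from math import gcd

def order(a, N):
    """Least positive r with a**r % N == 1 (brute force; assumes gcd(a, N) == 1)."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r

def factor_from_order(a, N):
    """Return a nontrivial factor of N obtained from r = ord(a), or None if this a fails."""
    common = gcd(a, N)
    if common != 1:
        return common                  # lucky: a already shares a factor with N
    r = order(a, N)
    if r % 2 == 1:
        return None                    # r is odd: try another a
    half = pow(a, r // 2, N)           # a^(r/2) mod N
    if (half + 1) % N == 0:
        return None                    # a^(r/2) + 1 is a multiple of N: try another a
    return gcd(half - 1, N)

if __name__ == "__main__":
    print(order(7, 15), factor_from_order(7, 15))
```

For N = 15 and a = 7 the order is r = 4, so $a^{r/2} = 49$, and $\gcd(49 - 1, 15) = 3$ is a nontrivial factor. The choice a = 14 fails the second condition, because $14 + 1$ is a multiple of 15.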

That’s it for the number theory. Now how do we find ord(a)? We use period finding! Define 


$$f(x) = a^x \bmod N. \qquad (2)$$

This has period $r$. Indeed

$$f(x + r) = a^{x+r} \bmod N = a^x a^r \bmod N = a^x \bmod N = f(x). \qquad (3)$$


The intuition behind period finding is similar to what we did for Simon’s algorithm, although 
now the Fourier transform is instead more like the traditional (i.e. U(1)) Fourier transform. Here’s 
how this works. Choose $n$ such that $N^2 < 2^n < 2N^2$. We will evaluate $f(x)$ on $x \in \{0, 1, \ldots, 2^n - 1\}$.

Before describing Shor’s algorithm we present a subroutine, the Quantum Fourier Transform. 


$$U_{\mathrm{QFT}}|x\rangle = \frac{1}{\sqrt{2^n}} \sum_{y=0}^{2^n - 1} e^{2\pi i x y / 2^n} |y\rangle. \qquad (4)$$


This is a unitary transformation and I claim it can be performed using $O(n^2)$ gates. Let’s see how
we use it first and then show how to perform it next. 
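
Unitarity of the transform in (4) is easy to confirm numerically for small n. This sketch builds the full $2^n \times 2^n$ matrix, which is exponentially large and therefore only useful as a check, not as an efficient implementation; `qft_matrix` and `is_unitary` are hypothetical helper names.

```python
import cmath
import math

def qft_matrix(n):
    """Matrix of the n-qubit QFT: entry [y][x] = exp(2 pi i x y / 2^n) / sqrt(2^n)."""
    dim = 2 ** n
    return [[cmath.exp(2j * math.pi * x * y / dim) / math.sqrt(dim)
             for x in range(dim)] for y in range(dim)]

def is_unitary(U, tol=1e-9):
    """Check that the columns of U are orthonormal, i.e. U^dagger U = I."""
    dim = len(U)
    for i in range(dim):
        for j in range(dim):
            inner = sum(U[k][i].conjugate() * U[k][j] for k in range(dim))
            target = 1.0 if i == j else 0.0
            if abs(inner - target) > tol:
                return False
    return True
```

For n = 1 the matrix reduces to the Hadamard gate.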
The quantum part of Shor’s algorithm consists of these steps: 


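• Prepare the state $\frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle \otimes |0\rangle$.
• Calculate $f(x)$ in the second register to obtain

$$\frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle \otimes |f(x)\rangle.$$

• Discard the second register. (But we just calculated it!) We can think of this as measuring
the register and ignoring the answer. The first register will be in a superposition of the form

$$C\left(|x_0\rangle + |x_0 + r\rangle + |x_0 + 2r\rangle + \ldots\right)$$

for a random choice of $x_0$. Let’s make this more precise and say the state is

$$\frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} |x_0 + kr\rangle, \qquad (5)$$

with $m = 2^n/r$. At this point our superposition has period $r$. Think of this as a state with
period $r$ in time. If we Fourier transform we should get a peak in frequency proportional to
$1/r$ (in fact $2^n/r$ for our discretized Fourier transform). Let’s now do the math and see that
this happens.

• Apply $U_{\mathrm{QFT}}$. This yields

$$\frac{1}{\sqrt{m 2^n}} \sum_{k=0}^{m-1} \sum_{y=0}^{2^n-1} e^{2\pi i (x_0 + kr) y / 2^n} |y\rangle. \qquad (6)$$

• Measure $y$. We find

$$\Pr[y] = \frac{1}{m 2^n} \left| \sum_{k=0}^{m-1} e^{2\pi i k r y / 2^n} \right|^2. \qquad (7)$$

To get some intuition for (7), suppose $y \approx 2^n j / r$ for some integer $j$. Then the phase will
be $\approx 1$ and we will get $\Pr[y] \approx m/2^n = 1/r$. Let’s argue this more precisely. Observe that
$\sum_{k=0}^{m-1} z^k = (1 - z^m)/(1 - z)$. Consider $y = 2^n j / r + \delta_j$ for $|\delta_j| < 1/2$. Then

$$\Pr[y] = \frac{1}{m 2^n} \left| \frac{1 - \exp\left(\frac{2\pi i m r y}{2^n}\right)}{1 - \exp\left(\frac{2\pi i r y}{2^n}\right)} \right|^2 = \frac{1}{m 2^n} \left| \frac{1 - \exp\left(\frac{2\pi i m r \delta_j}{2^n}\right)}{1 - \exp\left(\frac{2\pi i r \delta_j}{2^n}\right)} \right|^2. \qquad (8)$$

Using $mr \approx 2^n$ and $|1 - e^{i\theta}| = 2|\sin(\theta/2)|$ we have

$$\Pr[y] \approx \frac{1}{m 2^n} \frac{\sin^2(\pi\delta_j)}{\sin^2(\pi\delta_j/m)} \geq \frac{1}{m 2^n} \frac{(\pi\delta_j/(\pi/2))^2}{(\pi\delta_j/m)^2} = \frac{4}{\pi^2} \frac{1}{r}. \qquad (9)$$

Therefore with probability $\geq 4/\pi^2$ we obtain $y \approx j 2^n / r$, from which we can extract $r$.

• Extract $r$ using more number theory. This part is a classical computation so I will not dwell on
it. From the last step we obtain $\frac{y}{2^n} = \frac{j}{r} + \frac{\delta_j}{2^n}$ for an unknown integer $j$ and unknown $|\delta_j| < 1/2$.
Since $r < N$ and $N^2 < 2^n$, finding the best rational approximation with denominator $< N$
will do the job. This “best rational approximation” can be found by something called the
continued fraction expansion, which we illustrate with an example.

$$\frac{49}{300} = \frac{1}{\frac{300}{49}} = \frac{1}{6 + \frac{6}{49}} = \frac{1}{6 + \frac{1}{8 + \frac{1}{6}}}.$$

Rational approximations are obtained by truncating this series, e.g. $1/(6 + 1/8) = 8/49 \approx 49/300$.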
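
The continued fraction example, and the way it recovers $r$ from a measured $y$, can be reproduced with a few lines of Python. This is a sketch under my own naming (`continued_fraction`, `convergents`, `measurement_probabilities`); it evaluates the sum in (7) directly for a small register, and it uses the standard library method `Fraction.limit_denominator` as the "best rational approximation" step.

```python
import cmath
import math
from fractions import Fraction

def continued_fraction(p, q):
    """Continued-fraction coefficients [a0; a1, a2, ...] of p/q."""
    coeffs = []
    while q:
        coeffs.append(p // q)
        p, q = q, p % q
    return coeffs

def convergents(coeffs):
    """Successive truncations of the continued fraction, as Fractions."""
    h_prev, h = 1, coeffs[0]          # numerators
    k_prev, k = 0, 1                  # denominators
    result = [Fraction(h, k)]
    for a in coeffs[1:]:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
    return result

def measurement_probabilities(n, r, x0=0):
    """Pr[y] for the periodic state x0, x0 + r, x0 + 2r, ... below 2^n."""
    dim = 2 ** n
    xs = range(x0, dim, r)
    m = len(xs)
    probs = []
    for y in range(dim):
        amplitude = sum(cmath.exp(2j * math.pi * x * y / dim) for x in xs)
        probs.append(abs(amplitude) ** 2 / (m * dim))
    return probs

if __name__ == "__main__":
    print(continued_fraction(49, 300), convergents(continued_fraction(49, 300)))
    # Period r = 3 with n = 8: y = 85 is the integer closest to 2^8 / 3.
    print(Fraction(85, 2 ** 8).limit_denominator(14))
```

Note that a measured $y$ only reveals $j/r$ in lowest terms, so when $j$ and $r$ share a factor the denominator found this way is a proper divisor of $r$ and the procedure has to be repeated.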


For the last twenty years there have been many attempts to generalize Shor’s algorithm to 
other groups. In some cases these would break other cryptosystems. It is often possible to find 
efficient quantum Fourier transforms for other groups but finding analogues of the continued fraction
expansion and the other ingredients has been elusive.


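Finally let’s talk about how to implement $U_{\mathrm{QFT}}^{(n)}$. The superscript denotes the number of qubits.
When $n = 1$ this is just our old friend the Hadamard transform.

$$U_{\mathrm{QFT}}^{(1)} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

When $n = 2$ we get fourth roots of unity, namely $\pm i$ in addition to $\pm 1$:

$$U_{\mathrm{QFT}}^{(2)} = \frac{1}{\sqrt{4}} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix}.$$

In general the matrix elements of $U_{\mathrm{QFT}}^{(n)}$ look like $\exp(2\pi i x y / 2^n)$. Let’s see how this looks in terms of
the bits of $x$ and $y$. Write $x = x_0 + 2x_1 + 4x_2 + \ldots + 2^{n-1} x_{n-1}$ and similarly $y = y_0 + \ldots + 2^{n-1} y_{n-1}$.
Then

$$xy \bmod 2^n = x_0 y_0 + 2(x_1 y_0 + x_0 y_1) + \ldots + 2^{n-1}(x_0 y_{n-1} + \ldots + x_{n-1} y_0).$$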


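Figure 1: Quantum circuit implementing the quantum Fourier transform. (Circuit diagram not reproduced. Inputs: $x_{n-1}, \ldots, x_0$; outputs: $y_0, \ldots, y_{n-1}$; elements: a Hadamard gate $H$, rotation gates $R_2, R_3, \ldots$, and $U_{\mathrm{QFT}}^{(n-1)}$.)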


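Write $x = 2^{n-1} x_{n-1} + \tilde{x}$ and $y = 2\tilde{y} + y_0$. Then

$$\frac{1}{\sqrt{2^n}} e^{\frac{2\pi i x y}{2^n}} = \frac{1}{\sqrt{2^n}}\, e^{\frac{2\pi i 2^{n-1} x_{n-1} y_0}{2^n}}\, e^{\frac{2\pi i \tilde{x} y_0}{2^n}}\, e^{\frac{2\pi i 2^{n-1} x_{n-1} 2\tilde{y}}{2^n}}\, e^{\frac{2\pi i \tilde{x} 2\tilde{y}}{2^n}} = \frac{(-1)^{x_{n-1} y_0}}{\sqrt{2}}\, e^{\frac{2\pi i \tilde{x} y_0}{2^n}}\, \frac{e^{\frac{2\pi i \tilde{x}\tilde{y}}{2^{n-1}}}}{\sqrt{2^{n-1}}}.$$

This suggests a recursive algorithm for $U_{\mathrm{QFT}}^{(n)}$ which we express as a circuit diagram. See Fig. 1.
There we have defined the rotation gate

$$R_j = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i / 2^j} \end{pmatrix}. \qquad (10)$$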



MIT Quantum Theory Notes 


Supplementary Notes for MIT’s Quantum 
Theory Sequence 
©R. L. Jaffe 1992-2006 
February 8, 2007 


Canonical Quantization and Application to 
the Quantum Mechanics of a Charged 
Particle in a Magnetic Field 




Contents

1 Introduction

2 Canonical Quantization
2.1 The canonical method
2.2 Simple Examples
2.2.1 Bead on a Wire
2.2.2 Relative and Center of Mass Coordinates
2.3 Warnings
2.3.1 Quantum variables without classical analogues
2.3.2 Operator ordering ambiguities
2.3.3 Singular points
2.3.4 Constrained systems

3 Motion in a Constant Magnetic Field - “Landau Levels”
3.1 Introduction
3.2 Lagrangian, Hamiltonian and Canonical Quantization
3.3 A solution to the quantum equations of motion
3.4 Physical Interpretation of Landau Levels
3.4.1 The location and size of Landau levels
3.4.2 A more careful look at translation invariance

4 The Aharonov Bohm Effect

5 Integer Quantum Hall Effect
5.1 The ordinary Hall effect and the relevant variables
5.2 Electrons in crossed electric and magnetic fields
5.2.1 Setting up the problem
5.2.2 Eigenstates and eigenenergies
5.2.3 Degeneracies
5.2.4 The Hall current
5.2.5 A description of the integer quantum Hall effect




1 Introduction 


Classical mechanics is internally consistent. No amount of examination of 
Newton’s Laws as an abstract system will lead you to quantum mechanics. 
The quantum world forced itself upon us when physicists tried and failed to 
explain the results of experiments using the tools of classical mechanics. It 
took and still takes considerable guesswork to find the proper description of a 
new quantum system when first encountered. Notions like internal spin and 
the Pauli exclusion principle have no analog whatsoever in classical physics. 

However, the equations of motion of quantum mechanics, looked at from 
a particular point of view, resemble the Hamiltonian formulation of classical 
mechanics. This similarity has led to a program for guessing the quantum 
description of systems with classical Hamiltonian formulations. The program 
is known as “canonical quantization” because it makes use of the “canonical” 
i.e. Hamiltonian, form of classical mechanics. Though it is very useful and 
quite powerful, it is important to remember that it provides only the first 
guess at the quantum formulation. The only way to figure out the complete 
quantum mechanical description of a system is through experiment. Also, 
recall from 8.05 that there are many quantum mechanical systems (like the 
spin-1/2 particle, for example) whose Hamiltonians cannot be obtained by 
canonically quantizing some classical Hamiltonian. 

In lecture this week, we will apply the method of canonical quantization 
to describe the motion of a charged particle in a constant magnetic field. In 
so doing, we shall discover several beautiful, and essentially quantum me- 
chanical, phenomena: Landau levels, the integer quantum Hall effect, and 
the Aharonov-Bohm effect. Our treatment will be self-contained, and thus 
of necessity will barely scratch the surface of the subject. We will, how- 
ever, be able to grasp the essence of several key ideas and phenomena. This 
part of 8.059 serves as an introduction to the condensed matter physics of 
electrons in materials at low temperatures in high magnetic fields, which is 
a vast area of contemporary experimental and theoretical physics. The integer
and fractional quantum Hall effects (we shall not treat the fractional
case) were both discovered in experiments done in the 1980’s, and were among
the biggest surprises in physics of recent decades. They have been the sub- 
ject of intense investigation ever since, including by many physicists here at 
MIT. Furthermore, understanding gauge invariance and phenomena like the 
Aharonov-Bohm effect are key aspects of the modern understanding of the 
theories that govern the interactions of all known elementary particles. 




Section 2 describes the method of canonical quantization. We studied
this in 8.05. I will not lecture on the material of Section 2; I am providing it
here for you to read and review. 


©R. L. Jaffe, 1998


2 Canonical Quantization 


2.1 The canonical method 


There is a haunting similarity between the equations of motion for operators 
in the Heisenberg picture and the classical Hamilton equations of motion in 
Poisson bracket form. 

First let’s summarize the quantum equations of motion. Consider a sys- 
tem with N degrees of freedom. These could be the coordinates of N/3 
particles in three dimensions or of N particles in one dimension for example.
Generically we label the coordinates $\{x_j\}$ and the momenta $\{p_j\}$, where
$j = 1, 2, \ldots N$. We denote the Hamiltonian $H(x, p)$, where we drop the subscripts
on the $x$’s and $p$’s if no confusion results. From wave mechanics where
$p_j$ is represented by $-i\hbar\,\partial/\partial x_j$,

$$[x_j, x_k] = 0$$
$$[p_j, p_k] = 0$$
$$[x_j, p_k] = i\hbar\delta_{jk} \qquad (1)$$

From Ehrenfest’s equation (for a general operator), $A(t)$, in the Heisenberg
picture,

$$i\hbar \frac{\partial A}{\partial t} = [A, H] \qquad (2)$$

(where we suppress the subscript $H$ on the operator) we can obtain equations
of motion for the $\{x_j\}$ and $\{p_j\}$,

$$i\hbar \dot{x}_j = [x_j, H]$$
$$i\hbar \dot{p}_j = [p_j, H] \qquad (3)$$



In the particular case where $H = \sum_{j=1}^{N} (p_j^2/2m) + V(x)$, it is easy to extend
the work we did in the case of the harmonic oscillator to obtain from (3)

$$\dot{p}_j = -\frac{\partial H}{\partial x_j} = -\frac{\partial V}{\partial x_j}$$
$$\dot{x}_j = \frac{\partial H}{\partial p_j} = \frac{p_j}{m} \qquad (4)$$

Equations (4) look exactly like Hamilton’s equations. When the two lines
are combined, we obtain Newton’s second law, $m\ddot{x}_j = -\partial V/\partial x_j$. Of course,
we have to remember that the content of these equations is very different in 
quantum mechanics than in classical mechanics: operator matrix elements 
between states are the observables, and the states cannot have sharp values of 
both x and p. Nevertheless (4) are identical in form to Hamilton’s equations 
and the similarity has useful consequences. 

In fact, it is the form of (2) and (3) that most usefully connects to classical 
mechanics. Let’s now turn to the Poisson Bracket formulation of Hamilton’s 
equations for classical mechanics. We have a set of N canonical coordinates 
$\{x_j\}$ and their conjugate momenta $\{p_j\}$. Suppose $A$ and $B$ are any two dynamical
variables, that is, they are characteristics of the system depending
on the $x$’s and the $p$’s. Examples of dynamical variables include the angular
momentum, $\vec{L} = \vec{x} \times \vec{p}$, or the kinetic energy, $\sum_j p_j^2/2m$. Then the Poisson
Bracket of $A$ and $B$ is defined by,

$$\{A, B\}_{PB} = \sum_{j=1}^{N} \left( \frac{\partial A}{\partial x_j} \frac{\partial B}{\partial p_j} - \frac{\partial A}{\partial p_j} \frac{\partial B}{\partial x_j} \right). \qquad (5)$$


Poisson Brackets are introduced into classical mechanics because of the re- 
markably simple form that Hamilton’s equations take when expressed in 
terms of them, 


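$$\dot{x}_j = \frac{\partial H}{\partial p_j} = \{x_j, H\}_{PB}$$
$$\dot{p}_j = -\frac{\partial H}{\partial x_j} = \{p_j, H\}_{PB} \qquad (6)$$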


as can easily be verified by using the definition of the PB on the dynamical 
variables $x_j$, $p_j$, and $H$. The time development of an arbitrary dynamical




variable can also be written simply in terms of Poisson Brackets. For simplicity
we consider dynamical variables that do not depend explicitly on the
time.¹ Then


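$$\dot{A} = \frac{dA}{dt} = \sum_{j=1}^{N} \left( \frac{\partial A}{\partial x_j} \dot{x}_j + \frac{\partial A}{\partial p_j} \dot{p}_j \right)$$
$$= \{A, H\}_{PB} \qquad (7)$$

where the second line follows from the first by substituting from (6) for $\dot{x}$
and $\dot{p}$.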

Finally, to complete the analogy, note that the Poisson Brackets of the 
x’s and the p’s themselves are remarkably simple, 


$$\{x_j, x_k\} = 0$$
$$\{p_j, p_k\} = 0$$
$$\{x_j, p_k\} = \delta_{jk} \qquad (8)$$


because $\partial x_j/\partial p_k = 0$, $\partial x_j/\partial x_k = \delta_{jk}$, etc.
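
The Poisson bracket relations can be checked numerically. The following sketch approximates the partial derivatives in the definition (5) by central differences; the function `poisson_bracket`, the step size `h`, and the one-dimensional oscillator used as a test Hamiltonian (with made-up values of the mass and spring constant) are all assumptions made for illustration.

```python
def poisson_bracket(A, B, x, p, h=1e-5):
    """Numerical {A, B}_PB at the phase-space point (x, p).

    A and B are functions of (x, p), where x and p are lists of equal length.
    Partial derivatives are approximated by central differences of step h.
    """
    def partial(F, which, j):
        x_plus, p_plus = list(x), list(p)
        x_minus, p_minus = list(x), list(p)
        if which == "x":
            x_plus[j] += h
            x_minus[j] -= h
        else:
            p_plus[j] += h
            p_minus[j] -= h
        return (F(x_plus, p_plus) - F(x_minus, p_minus)) / (2 * h)

    return sum(partial(A, "x", j) * partial(B, "p", j)
               - partial(A, "p", j) * partial(B, "x", j)
               for j in range(len(x)))

# Illustrative one-dimensional harmonic oscillator (mass and spring constant are made up).
mass, spring = 2.0, 3.0

def hamiltonian(x, p):
    return p[0] ** 2 / (2 * mass) + 0.5 * spring * x[0] ** 2

def position(x, p):
    return x[0]

def momentum(x, p):
    return p[0]
```

At any phase-space point one should find that the bracket of `position` with `momentum` is 1, that the bracket of `position` with `hamiltonian` is $p/m$, and that the bracket of `momentum` with `hamiltonian` is $-kx$, in agreement with Hamilton’s equations.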

Now we can step back and compare the Poisson Bracket formulation 
of classical mechanics with the operator equations of motion of quantum 
mechanics. Compare (1) to (8), (2) to (7) and (3) to (6). It appears that a 
classical Hamiltonian theory can be transcribed into quantum mechanics by 
the simple rule, 


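$$\{A, B\}_{PB} \Rightarrow \frac{1}{i\hbar} [\hat{A}, \hat{B}]. \qquad (9)$$


where the quantum operators $\hat{A}$ and $\hat{B}$ are the same functions of the operators
$\hat{x}_j$ and $\hat{p}_j$ as $A$ and $B$ are of $x_j$ and $p_j$.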

This remarkable rule tells us how to guess the quantum theory corre- 
sponding to a given classical dynamical system. The procedure is called 
“canonical quantization” because it follows from the canonical Hamiltonian 
description of the classical dynamics. In fact there are some important limi- 
tations of the canonical quantization method that will be discussed in a later 
subsection. First, however, let’s summarize (and appreciate the elegance 


¹A dynamical variable may or may not depend explicitly upon the time. Any dynamical
variable will depend implicitly on the time through the variables $x_j$ and $p_j$. Explicit time
dependence arises when some agent external to the system varies explicitly with the time.
An example is the time dependence of the magnetic interaction energy, $-\vec{\mu} \cdot \vec{B}(t)$, when
an external magnetic field depends on time.




of) the simple steps necessary to find the quantum equivalent of a classical 
Hamiltonian system — 


• Set up the classical Hamiltonian dynamics in terms of canonical coordinates
$\{x_j\}$ and momenta $\{p_j\}$, with a Hamiltonian $H$.


• Write the equations of motion in Poisson Bracket form.


• Reinterpret the classical dynamical variables as quantum operators in
a Hilbert space of states. The commutation properties of the quantum
operators are determined by the rule (9).


Of course we cannot forget the difference between quantum and classical 
mechanics: Although the fundamental equations of motion can be placed in 
correspondence by the canonical quantization procedure, the different inter- 
pretation of classical and quantum variables leads to totally different pictures 
of phenomena. 


2.2 Simple Examples 


Here are some simple examples of the canonical quantization procedure. 
Later we will encounter a very important and non-trivial example in the 
problem of a charged particle moving in a magnetic field. 


2.2.1 Bead on a Wire 


Suppose a rigid wire is laid out in space along a curve $\vec{x}(s)$. We parameterize
the wire by a single coordinate $s$ which measures length along the wire. Let a
bead slide without friction along the wire. This is a standard (easy) problem
in Lagrangian mechanics. The energy of the bead is entirely kinetic and is
given by $\frac{1}{2}mv^2 = \frac{1}{2}m\dot{s}^2$, because the bead is restricted to move only along
the curve. So the Lagrangian is $L = \frac{1}{2}m\dot{s}^2$; the momentum conjugate to $s$ is
$p = \partial L/\partial\dot{s} = m\dot{s}$; the Hamiltonian is $H = p^2/2m$; and the quantum theory
is defined by the operators $\hat{s}$, $\hat{p}$ and $\hat{H} = \hat{p}^2/2m$. In short the bead behaves
like a free particle on a line. It experiences no forces due to the curving of
the wire.

We can dress up this problem a little by adding gravity. Suppose the wire
is placed in a constant gravitational field $\vec{g} = -g\hat{y}$. Now there is a potential
energy $V(s) = mgy(s)$. The canonical operators are still $\hat{s}$ and $\hat{p}$, but now
the Hamiltonian is $\hat{H} = \hat{p}^2/2m + mgy(\hat{s})$.




This is actually so oversimplified a problem that interesting physics has 
been lost. A real bead is held on a real wire by some force that prevents 
it moving in the directions transverse to the wire. A simple model of this 
would be to replace the wire by a tube whose center follows the track of the 
wire, but has a finite (say constant circular) cross section. Then the particle 
would be free to move in the tube, but unable to leave it. This is a more 
complicated problem, but it can be analyzed pretty simply in the limit that 
the transverse dimension of the tube, d, is small compared to the radius of 
curvature, $R(s)$, of the wire.² ³ The end result is the appearance of a new
term in $H$ proportional to the inverse of the squared radius of curvature,

$$\hat{H} = \frac{\hat{p}^2}{2m} - \frac{\hbar^2}{2m} \frac{1}{4R(s)^2}. \qquad (10)$$

So a real bead on a wire feels a force that attracts it to the regions where the 

wire is most curved (R(s) is smallest). 


2.2.2 Relative and Center of Mass Coordinates 


Consider two particles moving in three dimensions and interacting with one 
another by a force that depends only on their relative separation. The clas- 
sical Lagrangian is 


$$L = \frac{1}{2} m_1 \dot{\vec{r}}_1^{\,2} + \frac{1}{2} m_2 \dot{\vec{r}}_2^{\,2} - V(\vec{r}_1 - \vec{r}_2). \qquad (11)$$


We could quantize this canonically in this form and obtain a two particle
Schroedinger equation. Instead let’s make the transformation to relative and
center of mass coordinates at the classical level and quantize from there. We
define

$$\vec{r} = \vec{r}_1 - \vec{r}_2$$
$$\vec{R} = \frac{1}{M} (m_1 \vec{r}_1 + m_2 \vec{r}_2) \qquad (12)$$


²The radius of curvature at some point $\vec{x}(s)$ is the radius of a circular disk that is
adjusted to best approximate the wire at the point $\vec{x}$.

³This problem has attracted attention recently in connection with the propagation of
electrons in “quantum wires”. If you’re interested I can supply references.




where $M = m_1 + m_2$ is the total mass of the system. Substituting into (11)
we find

$$L = \frac{1}{2} M \dot{\vec{R}}^2 + \frac{1}{2} \mu \dot{\vec{r}}^{\,2} - V(\vec{r}). \qquad (13)$$

where $\mu = m_1 m_2/M$ is the reduced mass of the two body system. Now it is easy to
read off $\vec{P}$, the momentum canonically conjugate to $\vec{R}$, $\vec{P} = M\dot{\vec{R}}$, and $\vec{p}$,
the momentum canonically conjugate to $\vec{r}$, $\vec{p} = \mu\dot{\vec{r}}$. The transition to quantum
mechanics is accomplished by postulating the canonical commutators,
$[R_j, P_k] = i\hbar\delta_{jk}$, $[r_j, p_k] = i\hbar\delta_{jk}$, etc. The Hamiltonian is

$$H = \frac{\vec{P}^2}{2M} + \frac{\vec{p}^{\,2}}{2\mu} + V(\vec{r}). \qquad (14)$$

So just like classical mechanics, the center of mass moves like a free particle,
and the interesting part of the dynamics is just like a single particle of mass
$\mu$ moving in a potential $V(\vec{r})$.
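
The separation of the kinetic energy into center of mass and relative parts is a short algebraic identity, and it can be spot-checked numerically. The sketch below compares the two forms of the kinetic term for arbitrary masses and velocities; the function names and the sample numbers are my own.

```python
def kinetic_two_body(m1, m2, v1, v2):
    """Kinetic energy written in terms of the two particle velocities."""
    return (0.5 * m1 * sum(c * c for c in v1)
            + 0.5 * m2 * sum(c * c for c in v2))

def kinetic_cm_relative(m1, m2, v1, v2):
    """The same kinetic energy in center-of-mass and relative coordinates."""
    M = m1 + m2                      # total mass
    mu = m1 * m2 / M                 # reduced mass
    V_cm = [(m1 * a + m2 * b) / M for a, b in zip(v1, v2)]   # d/dt of R
    v_rel = [a - b for a, b in zip(v1, v2)]                  # d/dt of r
    return (0.5 * M * sum(c * c for c in V_cm)
            + 0.5 * mu * sum(c * c for c in v_rel))

if __name__ == "__main__":
    args = (1.5, 4.0, [0.3, -1.2, 2.0], [-0.7, 0.4, 1.1])
    print(kinetic_two_body(*args), kinetic_cm_relative(*args))
```

The two printed numbers agree up to rounding, because the cross terms between the center of mass velocity and the relative velocity cancel exactly.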

These examples may seem overly elementary. If this were all canonical 
quantization was good for, it would not be necessary for us to spend much 
time on it. Moreover there are many mistakes to be made by applying the 
canonical method too naively (as we shall see below). In fact, canonical 
quantization helps us guess the quantum equivalent of some highly non-trivial 
classical systems like charged particles moving in electromagnetic fields, and 
the dynamics of the electromagnetic field itself. 


2.3 Warnings


The canonical quantization method is not a derivation of quantum mechanics 
from classical mechanics. The substitution (9) cannot be motivated within 
classical mechanics. It represents a guess, or a leap of the imagination, 
forced on us by the bizarre phenomena that were observed by the early 
atomic physicists and that were inexplicable within the confines of classical 
mechanics. The canonical quantization method is simply a recognition that 
the quantum mechanics of a single particle that was developed from wave 
mechanics is in fact a representative of a class of systems — those described 
by traditional Hamiltonian mechanics — that all can be quantized by the 
same methods. 

Many systems we are interested in quantizing differ from this norm. The 
attitude I would like to advocate is that we use canonical quantization as 
the first step toward a quantum equivalent of a classical theory, but that 




we remain open minded about the need to augment or refine the quantum 
theory if phenomena force us. Some of the problems that arise on the road 
from classical to quantum mechanics are listed below in order of increasing 
severity (in my opinion). 


2.3.1. Quantum variables without classical analogues 


There are some systems that possess quantum degrees of freedom that,
for one reason or another, do not persist in the classical regime. The best
known example is spin. When the quantum states of electrons were studied 
in the 1920’s, it was soon discovered that the electron possesses other de- 
grees of freedom that classical point particles don’t have. Using guesswork 
and experimental information, physicists invented operators that describe the 
behavior of this innately quantum mechanical variable. Nature has forced 
us to postulate new quantum variables to augment the classical description 
of a system. This does not represent a failure of quantum mechanics. Quite 
the opposite — one of its great strengths is that phenomena without clas- 
sical analog can be introduced relatively easily, without upsetting the basic 
framework of the theory. 

Many great advances of 20th century physics (spin, color, internal symme- 
tries, etc.) fall into the category of discovering and understanding quantum 
variables without classical analog. Physicists relish the possibility of such 
radical departures from classical dynamics. 


2.3.2 Operator ordering ambiguities 


Classical dynamical variables commute with one another, so the order in
which they are written does not affect the dynamics. Not so in quantum
mechanics. Suppose our Lagrangian was $L = \frac{m\,\xi^2\dot\xi^2}{2R^2}$.$^4$ Then the classical
Hamiltonian would have been $H = \frac{R^2}{2m}\frac{p^2}{\xi^2}$. When $p$ and $\xi$ become operators, how
is this to be interpreted? Is it $\frac{R^2}{2m}\frac{1}{\xi^2}p^2$, or $\frac{R^2}{2m}\,p\frac{1}{\xi^2}p$, or some other variant?
Using $[\xi, p] = i\hbar$ it is easy to see that the different variants yield different
Schroedinger wave equations and therefore different physics.

This problem is called an “operator ordering ambiguity”. More physics 
input is required to eliminate the ambiguity. Sometimes general principles 


4This is the lagrangian of a point mass on a string wrapping around a spool of radius 
R. The coordinate € is the length of string (assumed straight) not wound up. 


help: a Hamiltonian must be hermitian in order that probability be con-
served. If the ambiguous term $xp$ (note $xp \neq px$) occurred in a Hamiltonian,
we could rather confidently replace it by the hermitian form $\frac{1}{2}(xp + px)$.
Sometimes, hermiticity is not enough. General conservation laws, like con- 
servation of momentum or angular momentum help. If all else fails, it is 
necessary to leave the ambiguity (parameterized by the relative strength of 
different hermitian combinations) and see which best describes experiment. 
In practice I am not aware of physically important examples where hermitic- 
ity and conservation laws fail to resolve operator ordering ambiguities. 
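
The ordering ambiguity can be made concrete with a small sketch. Here polynomial wavefunctions are stored as coefficient lists, $x$ acts by multiplication and $p$ by $-i\hbar\,d/dx$; the helper names and the choice $\hbar = 1$ are illustrative assumptions, not part of the notes. The sketch checks that $xp$ and $px$ differ by $i\hbar$ and builds the symmetrized form $\frac{1}{2}(xp + px)$.

```python
HBAR = 1.0  # assumption for this sketch: units with hbar = 1

def x_op(coeffs):
    """Multiply a polynomial (coefficients in increasing powers of x) by x."""
    return [0j] + [complex(c) for c in coeffs]

def p_op(coeffs):
    """Apply p = -i*hbar*d/dx to a polynomial."""
    out = [-1j * HBAR * k * complex(coeffs[k]) for k in range(1, len(coeffs))]
    return out or [0j]

def combine(a, b, wa=1.0, wb=1.0):
    """Return wa*a + wb*b, padding the shorter coefficient list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [wa * u + wb * v for u, v in zip(a, b)]

def commutator_xp(coeffs):
    """(xp - px) acting on the polynomial."""
    return combine(x_op(p_op(coeffs)), p_op(x_op(coeffs)), 1.0, -1.0)

def symmetrized_xp(coeffs):
    """The hermitian ordering (xp + px)/2 acting on the polynomial."""
    return combine(x_op(p_op(coeffs)), p_op(x_op(coeffs)), 0.5, 0.5)

f = [1, 2, 3]  # f(x) = 1 + 2x + 3x^2
print(commutator_xp(f))   # coefficients of i*hbar*f
print(symmetrized_xp(f))
```

Acting on any polynomial, the commutator returns $i\hbar$ times the input, so the two orderings $xp$ and $px$ really are different operators.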


2.3.3 Singular points 


The canonical quantization method becomes complicated and subtle when 
one tries to apply it to coordinate systems that include singular points. A 
familiar example is spherical polar coordinates ($r$, $\theta$, and $\phi$). The origin,
$r = 0$, is a singular point for spherical polar coordinates: for example, $\theta$
and $\phi$ are not defined at $r = 0$. If you follow the canonical formalism through
from Lagrangian to canonical momenta ($p_r$, $p_\theta$, and $p_\phi$), to Hamiltonian, to
canonical commutators, a host of difficulties arise. Although it is possible to
sort them out by insisting that the canonical momenta be hermitian opera-
tors, it is considerably easier to quantize the system in Cartesian coordinates
and make the change to spherical polar coordinates at the quantum level.
This is the path taken in most elementary treatments of quantum mechanics
in three dimensions: the operator $\vec p^{\,2} = p_1^2 + p_2^2 + p_3^2$ is recognized as the
Laplacian in coordinate representation ($p_j \to -i\hbar\,\partial/\partial x_j \Rightarrow \vec p^{\,2} \to -\hbar^2\nabla^2$)
and the transformation to polar coordinates is made by writing the Lapla-
cian and the wavefunction in terms of $r$, $\theta$, and $\phi$. As a rule of thumb:
the canonical approach becomes cumbersome when the classical coordinates
and/or momenta do not range over the full interval from $-\infty$ to $+\infty$.


2.3.4 Constrained systems 


Finally we must at least mention a complicated and rich variation of the 
canonical quantization method that has become an important focus for re- 
search in recent years. Sometimes the degrees of freedom of complicated 
systems are not all independent. For example, a particle may be constrained 
to move on a specified surface (in three dimensions). Then the coordinates 
and velocities that appear in the Lagrangian cannot be regarded as indepen-
dent variables. The changes in $x$, $y$ and $z$ must be correlated so that the
particle remains on the surface. The canonical formalism can break down
in several (related) ways. Sometimes one (or more) of the canonical mo-
menta is identically zero. If $p_k = 0$, then the associated Hamilton’s equation,
$\partial H/\partial q_k = 0$, is not an equation of motion. Instead it is a constraint that
must be satisfied by the canonical coordinates and momenta at each time.
The constraint may not be consistent with canonical commutation relations.
A simple, but not particularly interesting example, would be the constraint
$x + y = 0$ imposed on motion in two dimensions. The constraint is not consis-
tent with the canonical commutators $[x, p_y] = 0$ and $[y, p_y] = i\hbar$ because the
commutators can be added to give $[x + y, p_y] = i\hbar$. This case is not serious
because we could return to the original lagrangian, use the constraint to elim-
inate one dynamical variable from the problem, and then proceed without
difficulty. In this case, we would write $\xi = x - y$ and $\eta = x + y$ and use the
constraint to eliminate $\eta$ from the problem. In more complicated cases it is
not possible to remove the constraint in this fashion, either because it is too 
hard to solve the constraint equations for one or more variables, or because 
the problem has some deep underlying symmetry that would be broken by 
choosing to solve for and eliminate one variable as opposed to another. Dirac 
realized the importance of such problems and developed a method to handle 
quantization under constraints. Other powerful yet practical methods were 
developed by L. D. Faddeev and V. N. Popov in the 1960’s. Quantum ver- 
sions of electrodynamics, chromodynamics (the theory of the interactions of 
quarks and gluons) and gravity all make use of these modern extensions of 
the idea of canonical quantization. 


3 Motion in a Constant Magnetic Field — 
“Landau Levels” 


3.1 Introduction 


A pretty and relatively simple application of canonical methods in quantum 
theory is to the motion of a charged particle in a constant magnetic field. 
This problem was first solved by the great Russian theoretical physicist, Lev 
Landau. In recent years condensed matter physicists have found interesting 
applications of Landau’s problem to real physical systems. 

Suppose a magnetic field $\vec B_0$, constant in magnitude, direction, and time,
fills a region of space. For definiteness we assume $\vec B_0$ points in the $\hat e_3$ direction.
All points in the xy-plane are equivalent — a simple example of translation 
invariance. The classical motion of a charged particle in a constant magnetic 
field is determined by the Lorentz force law, 


$$ m\ddot{\vec x} = e\vec E + \frac{e}{c}\,\dot{\vec x}\times\vec B. \qquad (15) $$

When $\vec E = 0$ the force always acts at right angles to the velocity, so kinetic
energy is conserved. For simplicity, we restrict the motion to the xy-plane.
Then the particle moves in a circle at a constant angular velocity,

$$ \omega_L = \frac{eB_0}{mc}. \qquad (16) $$

$\omega_L$ is known as the Larmor frequency. (You can also find it called the cy-
clotron frequency.)
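
As a quick numerical illustration of the classical motion, the sketch below integrates the Lorentz force law with $\vec E = 0$ and $\vec B = B_0\hat e_3$ using a fourth-order Runge-Kutta step. The units ($e = m = c = 1$), the field value and the function names are assumptions made only for the example. After one period $2\pi/\omega_L$ the particle should be back where it started, with its speed unchanged.

```python
import math

def lorentz_rhs(state, omega):
    """Time derivative of (x, y, vx, vy) for B along e_3 and E = 0."""
    x, y, vx, vy = state
    # m dv/dt = (e/c) v x B  ->  dvx/dt = omega*vy, dvy/dt = -omega*vx
    return (vx, vy, omega * vy, -omega * vx)

def rk4_step(state, dt, omega):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorentz_rhs(state, omega)
    k2 = lorentz_rhs(shifted(state, k1, dt / 2), omega)
    k3 = lorentz_rhs(shifted(state, k2, dt / 2), omega)
    k4 = lorentz_rhs(shifted(state, k3, dt), omega)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(state, omega, t_total, steps):
    """Integrate the equations of motion for a total time t_total."""
    dt = t_total / steps
    for _ in range(steps):
        state = rk4_step(state, dt, omega)
    return state

e, m, c, B0 = 1.0, 1.0, 1.0, 2.0      # illustrative values
omega_L = e * B0 / (m * c)            # eq. (16)
start = (1.0, 0.0, 0.0, 1.0)          # x, y, vx, vy
end = evolve(start, omega_L, 2 * math.pi / omega_L, 2000)
print(end)
```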

Quantizing this system requires all the apparatus of the canonical quan- 
tization method we developed earlier in these notes. To analyze the problem 
quantum mechanically we must first find the classical Hamiltonian that de- 
scribes the system. Then we will be able to go over to the quantum domain. 
This requires us to introduce the concept of a vector potential, $\vec A$, which
determines the magnetic field by its curl,

$$ \vec B = \vec\nabla\times\vec A, \qquad (17) $$

in somewhat the same way that the electrostatic potential, $\Phi$, determines the
electric field, $\vec E = -\vec\nabla\Phi$. Once we have the Hamiltonian, we will solve the
quantum problem, illustrate some of the subtleties, and apply what we have 
learned to the Aharonov-Bohm and Integer Quantum Hall Effects. A recur- 
ring theme in much of this study is how the system reflects the underlying 
translation invariance in the xy-plane. The magnetic field is independent of 
x and y. The Hamiltonian does not respect this symmetry, however, because 
the vector potential depends on x and y in an asymmetric manner. In the 
end the physics must be translation invariant but it will take some work to 
demonstrate this. 


3.2 Lagrangian, Hamiltonian and Canonical Quantiza- 
tion 


A velocity-dependent force like (15) requires a velocity-dependent Lagrangian. 
A detailed derivation of the Lagrangian is more appropriately the subject of 
a course in electrodynamics. Here I will tell you the answer and show that the 
Euler-Lagrange equations reduce to the Lorentz force law. The Lagrangian 
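is

$$ L(\vec x, \dot{\vec x}) = \frac{1}{2}m\dot{\vec x}^2 + \frac{e}{c}\,\dot{\vec x}\cdot\vec A(\vec x). \qquad (18) $$

First, the momentum canonically conjugate to $x_j$ is,

$$ p_j = \frac{\partial L}{\partial\dot x_j} = m\dot x_j + \frac{e}{c}A_j. \qquad (19) $$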

Notice that the momentum in this problem is not mv. Next, the rest of the 
Euler-Lagrange equations, 


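$$ \frac{d}{dt}\frac{\partial L}{\partial\dot x_j} = \frac{\partial L}{\partial x_j} $$
$$ m\ddot x_j = \frac{e}{c}\sum_{k=1}^{3}\dot x_k\left(\frac{\partial A_k}{\partial x_j} - \frac{\partial A_j}{\partial x_k}\right). \qquad (20) $$

Notice that $\vec A$ depends on the particle’s position $\vec x(t)$ so that when we dif-
ferentiate by $t$ we obtain $\dot A_j = \sum_{k=1}^{3}\dot x_k\,\partial A_j/\partial x_k$, by the chain rule. The two
terms involving derivatives of $\vec A$ combine to form its curl,
$\frac{\partial A_k}{\partial x_j} - \frac{\partial A_j}{\partial x_k} = \epsilon_{jkl}B_l$.
So (20) reproduces the Lorentz force law correctly provided the curl of $\vec A$
gives the desired magnetic field as in (17).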
Next form the Hamiltonian, $H(\vec x, \vec p) = \dot{\vec x}\cdot\vec p - L$, with the result,

$$ H = \frac{1}{2}m\dot{\vec x}^2 = \frac{1}{2m}\left(\vec p - \frac{e}{c}\vec A\right)^2. \qquad (21) $$

Notice that the energy is merely $\frac{1}{2}mv^2$: since the magnetic field acts at
right angles to the particle’s velocity it does not contribute to the energy.
However the Hamiltonian must be regarded as a function of the canonical
coordinates and momenta, $\vec x$ and $\vec p$, as in the last step of (21).
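
The intermediate algebra behind (21) can be spelled out in one line. The following is only a sketch of the step, using the Lagrangian $L = \frac{1}{2}m\dot{\vec x}^2 + \frac{e}{c}\dot{\vec x}\cdot\vec A$ and the canonical momentum $\vec p = m\dot{\vec x} + \frac{e}{c}\vec A$ given above:

```latex
\begin{align*}
H &= \dot{\vec x}\cdot\vec p - L
   = \dot{\vec x}\cdot\left(m\dot{\vec x} + \frac{e}{c}\vec A\right)
     - \frac{1}{2}m\dot{\vec x}^2 - \frac{e}{c}\,\dot{\vec x}\cdot\vec A
   = \frac{1}{2}m\dot{\vec x}^2 , \\
m\dot{\vec x} &= \vec p - \frac{e}{c}\vec A
   \quad\Longrightarrow\quad
   H = \frac{1}{2m}\left(\vec p - \frac{e}{c}\vec A\right)^2 .
\end{align*}
```

The terms linear in $\vec A$ cancel, which is why the vector potential enters $H$ only through the combination $\vec p - \frac{e}{c}\vec A$.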


To quantize, we postulate canonical commutation relations, 


$$ [x_j, p_k] = i\hbar\,\delta_{jk} $$
$$ [x_j, x_k] = [p_j, p_k] = 0. \qquad (22) $$

Next we must prescribe a vector potential corresponding to $\vec B_0 = B_0\hat e_3$. We
choose

$$ \vec A = \frac{B_0}{2}\left(x_1\hat e_2 - x_2\hat e_1\right). \qquad (23) $$

It is easy to verify that $\vec\nabla\times\vec A = B_0\hat e_3$; however, other choices such as
$\vec A = B_0x_1\hat e_2$ would do just as well. They all describe the same magnetic field. In
classical mechanics we know that physics depends only on $\vec B$. The same is
true here in the Landau problem; however, at this moment we have only the
Hamiltonian, which appears to depend specifically on $\vec A$.
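
The statement that different vector potentials describe the same field is easy to check numerically. The sketch below compares the symmetric choice (23) with, as an assumed second gauge, $\vec A = B_0x_1\hat e_2$, computing the $\hat e_3$ component of the curl by central differences. The function names and the value of $B_0$ are illustrative only.

```python
B0 = 1.5  # illustrative field strength

def A_symmetric(x1, x2):
    """Symmetric gauge, eq. (23): (B0/2) (x1 e_2 - x2 e_1)."""
    return (-0.5 * B0 * x2, 0.5 * B0 * x1)

def A_landau(x1, x2):
    """An alternative gauge (assumed here): B0 x1 e_2."""
    return (0.0, B0 * x1)

def gauge_function(x1, x2):
    """Lambda such that A_landau = A_symmetric + grad(Lambda)."""
    return 0.5 * B0 * x1 * x2

def curl_z(A, x1, x2, h=1e-3):
    """e_3 component of curl A by central differences: dA2/dx1 - dA1/dx2."""
    dA2_dx1 = (A(x1 + h, x2)[1] - A(x1 - h, x2)[1]) / (2 * h)
    dA1_dx2 = (A(x1, x2 + h)[0] - A(x1, x2 - h)[0]) / (2 * h)
    return dA2_dx1 - dA1_dx2

for point in [(0.0, 0.0), (1.0, -2.0), (3.5, 0.7)]:
    print(point, curl_z(A_symmetric, *point), curl_z(A_landau, *point))
```

Both gauges give $B_0$ at every point, and they differ only by the gradient of `gauge_function`, which has zero curl.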

Substituting the explicit choice of A into H, we find 


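$$ H = \frac{1}{2m}\left[\left(p_1 + \frac{eB_0}{2c}x_2\right)^2 + \left(p_2 - \frac{eB_0}{2c}x_1\right)^2 + p_3^2\right] $$
$$ \quad = \frac{1}{2m}\left(p_1^2 + p_2^2 + p_3^2\right) + \frac{1}{2}m\omega^2\left(x_1^2 + x_2^2\right) - \omega L_3, \qquad (24) $$

where $\omega = \frac{1}{2}\omega_L = eB_0/2mc$, and $L_3 = x_1p_2 - x_2p_1$ is the angular momentum
in the $x_1$-$x_2$ plane. So the system looks like a particle in a two-dimensional
harmonic oscillator with an additional potential $-\omega L_3$. Having solved the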
harmonic oscillator before, we can easily construct the energy eigenstates for 
this problem. 


3.3. A solution to the quantum equations of motion 


First consider conservation laws: Apparently angular momentum about the
$\hat e_3$-axis is conserved, $[L_3, H] = 0$. So is momentum in the $\hat e_3$ direction,
$[p_3, H] = 0$. First, we dispose of the $p_3$ dependence by restricting the problem
to motion in the $x_1$-$x_2$ plane. Less formally we could equally well diagonalize
$p_3$, label our eigenstates by its eigenvalue, $k_3$, and add $(1/2m)k_3^2$ to our
eigenenergies.

The other components of momentum, however, are not conserved: 


$$ [p_j, H] \neq 0 \qquad (25) $$

for $j = 1, 2$. This comes as a surprise: since the magnetic field is uniform in
space we would expect the system to be translation invariant, and therefore 
to find momentum conserved. On the other hand, the classical motion is 
circular, so perhaps we should not be surprised that the usual concept of 
momentum has to be amended. Resurrecting momentum conservation will 
be a principal task in the following analysis. 

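To classify the eigenstates of (24) we introduce standard harmonic os-
cillator “creation” and “annihilation” operators (in the following we replace
the coordinates $x$ and $y$ by $x_1$ and $x_2$),

$$ x_k = \sqrt{\frac{\hbar}{2m\omega}}\left(a_k + a_k^\dagger\right), \qquad p_k = -i\sqrt{\frac{\hbar m\omega}{2}}\left(a_k - a_k^\dagger\right), \qquad (26) $$

and $a_\pm = \frac{1}{\sqrt 2}\left(a_1 \mp ia_2\right)$, with the usual commutation relations,

$$ [a_j, a_k^\dagger] = \delta_{jk} $$
$$ [a_j, a_k] = [a_j^\dagger, a_k^\dagger] = [a_+, a_-] = [a_+, a_-^\dagger] = 0 $$
$$ [a_\pm, a_\pm^\dagger] = 1. \qquad (27) $$

Substituting into $H$ and $L_3$ we obtain

$$ L_3 = \hbar\left(a_+^\dagger a_+ - a_-^\dagger a_-\right) $$
$$ H = \hbar\omega\left(a_+^\dagger a_+ + a_-^\dagger a_- + 1\right) - \hbar\omega\left(a_+^\dagger a_+ - a_-^\dagger a_-\right) $$
$$ \quad = \hbar\omega_L\left(a_-^\dagger a_- + \frac{1}{2}\right), \qquad (28) $$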

which has a straightforward, though unusual interpretation. Eigenstates of
$H$ and $L_3$ are labeled by the number of quanta of $+$ and $-$ excitation,

$$ N_\pm = a_\pm^\dagger a_\pm $$
$$ N_\pm|n_+, n_-\rangle = n_\pm|n_+, n_-\rangle. \qquad (29) $$

$\pm$ quanta carry $\pm 1$ unit of angular momentum, but only the $-$ quanta con-
tribute to the energy. So the energy eigenstates, known as Landau levels, are
each infinitely degenerate. After exploring some of the properties of Landau
levels we will return and try to sort out this degeneracy.
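
The operator identity for $L_3$ can be checked directly on number states. The sketch below stores a state as a dictionary from occupation numbers $(n_1, n_2)$ to amplitudes and applies ladder operators exactly, with no truncation. It assumes $\hbar = 1$, the combination $a_\pm = (a_1 \mp ia_2)/\sqrt 2$, and the form $L_3 = i\hbar(a_2^\dagger a_1 - a_1^\dagger a_2)$ that follows from substituting the oscillator expressions for $x_k$ and $p_k$ into $x_1p_2 - x_2p_1$; all helper names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1

def lower(state, mode):
    """Apply a_1 (mode=0) or a_2 (mode=1) to a state {(n1, n2): amplitude}."""
    out = {}
    for occ, amp in state.items():
        n = occ[mode]
        if n > 0:
            new = list(occ)
            new[mode] = n - 1
            out[tuple(new)] = out.get(tuple(new), 0j) + amp * math.sqrt(n)
    return out

def raise_(state, mode):
    """Apply the creation operator of the given mode."""
    out = {}
    for occ, amp in state.items():
        new = list(occ)
        new[mode] = occ[mode] + 1
        out[tuple(new)] = out.get(tuple(new), 0j) + amp * math.sqrt(occ[mode] + 1)
    return out

def lin(*terms):
    """Linear combination of (coefficient, state) pairs."""
    out = {}
    for coeff, state in terms:
        for occ, amp in state.items():
            out[occ] = out.get(occ, 0j) + coeff * amp
    return out

S = 1 / math.sqrt(2)

def a_pm(state, sign):
    """a_+ (sign=+1) = (a_1 - i a_2)/sqrt(2); a_- (sign=-1) = (a_1 + i a_2)/sqrt(2)."""
    return lin((S, lower(state, 0)), (-1j * sign * S, lower(state, 1)))

def a_pm_dag(state, sign):
    """Hermitian conjugates of a_+ and a_-."""
    return lin((S, raise_(state, 0)), (1j * sign * S, raise_(state, 1)))

def number_pm(state, sign):
    """N_+ or N_- acting on the state."""
    return a_pm_dag(a_pm(state, sign), sign)

def L3_from_xp(state):
    """L_3 = x1 p2 - x2 p1 = i*hbar*(a2^dag a1 - a1^dag a2)."""
    return lin((1j * HBAR, raise_(lower(state, 0), 1)),
               (-1j * HBAR, raise_(lower(state, 1), 0)))

def L3_from_pm(state):
    """L_3 = hbar*(N_+ - N_-)."""
    return lin((HBAR, number_pm(state, +1)), (-HBAR, number_pm(state, -1)))

def distance(s1, s2):
    """Largest difference between the amplitudes of two states."""
    keys = set(s1) | set(s2)
    return max((abs(s1.get(k, 0j) - s2.get(k, 0j)) for k in keys), default=0.0)

psi = {(2, 1): 1.0 + 0j, (0, 3): 0.5j}  # an arbitrary test state
print(distance(L3_from_xp(psi), L3_from_pm(psi)))
```

The two expressions for $L_3$ agree on the test state to rounding error, which also confirms the sign convention chosen for $a_\pm$ in this sketch.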
To summarize: 


• There are two physical conserved quantities, the energy $E$ and the $z$-
component of angular momentum, $L$.


• Equivalently, the numbers of “oscillator quanta”, $n_+$ and $n_-$, are con-
served, with


  - $E(n_+, n_-) = \hbar\omega_L\left(n_- + \frac{1}{2}\right)$ and
  - $L(n_+, n_-) = \hbar(n_+ - n_-)$.


• $E$ is independent of $n_+$, and $n_+$ can take on any non-negative integer
value, so each Landau energy level is infinitely degenerate.


• For energy $\left(n_- + \frac{1}{2}\right)\hbar\omega_L$ the tower of degenerate states begins at angular
momentum $-n_-\hbar$ and grows in steps of $\hbar$ to infinity.
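
The summary above can be turned into a few lines of code. This is a minimal sketch with hypothetical function names, working in units of $\hbar\omega_L$ for energy and $\hbar$ for angular momentum; it lists the start of the degenerate tower for a given Landau level.

```python
from fractions import Fraction

def landau_energy(n_minus):
    """Energy of a Landau level in units of hbar*omega_L: n_minus + 1/2."""
    return Fraction(2 * n_minus + 1, 2)

def angular_momentum(n_plus, n_minus):
    """Angular momentum in units of hbar: n_plus - n_minus."""
    return n_plus - n_minus

def tower(n_minus, size):
    """First `size` angular momenta of the degenerate tower at fixed n_minus."""
    return [angular_momentum(n_plus, n_minus) for n_plus in range(size)]

print(landau_energy(2), tower(2, 6))
```

For $n_- = 2$ every state in the tower has the same energy, while the angular momentum starts at $-2\hbar$ and increases by $\hbar$ for each additional $+$ quantum.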


3.4 Physical Interpretation of Landau Levels 


The results of the last section are as puzzling as they are enlightening. Clas-
sically a particle moving in the $x_1$-$x_2$ plane under the influence of a constant
magnetic field, $B_0\hat e_3$, can have any energy and its (circular) orbit can be cen-
tered at any point $(x_1, x_2)$. In the quantum world the states are gathered
into discrete energy levels separated by $\hbar\omega_L$. Each level is vastly degener-
ate. The Landau Hamiltonian, eq. (24), on the other hand, singles out a
specific origin of coordinates about which the harmonic oscillator potential 
is centered. In this section we explore the meaning of the Landau degeneracy 
and (eventually) explain how the translational invariance so obvious at the 
classical level, is manifest in the quantum theory. 


3.4.1 The location and size of Landau levels 


Despite superficial appearances each energy eigenstate found in the previous 
section can be placed wherever we wish in the $x_1$-$x_2$ plane. The simplest
way to see this uses the coherent state formalism that was developed in 8.05.
Since the energy depends only on $n_-$ we can superpose states with different
$n_+$ without leaving a fixed energy level. Define, then, the coherent state

$$ |\alpha\rangle = \exp\left(\alpha a_+^\dagger\right)|0, 0\rangle \qquad (30) $$

where $|0, 0\rangle$ is the eigenstate of eq. (24) with $n_+ = n_- = 0$, and $\alpha$ is an arbi-
trary complex number. Eq. (30) creates a coherent state in the $+$ oscillator
but leaves $n_- = 0$ untouched. Using eq. (26) and eq. (27) it is easy to show
that

$$ \langle x_1\rangle = \ell_0\,\mathrm{Re}\,\alpha $$
$$ \langle x_2\rangle = -\ell_0\,\mathrm{Im}\,\alpha, \qquad (31) $$

where $\ell_0 = \sqrt{\hbar/m\omega}$, so by choosing the real and imaginary parts of
$\alpha = (x_1 - ix_2)/\ell_0$ we can center a state with energy $E_0 = \frac{1}{2}\hbar\omega_L$ wherever we wish.

Momentum and velocity are not directly proportional in this problem. 
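The state $|\alpha\rangle$ provides a graphic example: The quantum equation of motion
for $x_k$ tells us,

$$ \dot x_k = \frac{1}{m}\left(p_k - \frac{e}{c}A_k\right) \qquad (32) $$

so it is easy to show that the expectation value of the velocity is zero in the
state $|\alpha\rangle$:

$$ \langle\alpha|\dot x_k|\alpha\rangle = \frac{1}{i\hbar}\langle\alpha|[x_k, H]|\alpha\rangle $$
$$ \quad = \frac{1}{m}\langle\alpha|\,p_k - \frac{e}{c}A_k\,|\alpha\rangle = 0, \qquad (33) $$

so the state centered at $(x_1, x_2)$ does not wander away with time. On the
other hand the components of the momenta $p_1$ and $p_2$ do not vanish in the
state $|\alpha\rangle$,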

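$$ \langle p_1\rangle = -m\omega\,\langle x_2\rangle = \frac{\hbar}{\ell_0}\,\mathrm{Im}\,\alpha $$
$$ \langle p_2\rangle = m\omega\,\langle x_1\rangle = \frac{\hbar}{\ell_0}\,\mathrm{Re}\,\alpha, \qquad (34) $$

but this has no direct physical interpretation because the momenta are not sim-
ply $m\vec v$.$^5$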

$^5$We shall see in lecture that $\langle p_k\rangle$ is not gauge invariant, and therefore cannot correspond
to a physical observable. $\langle mv_k\rangle = \langle p_k - (e/c)A_k\rangle$ is gauge invariant.

In contrast to its position, the size of a Landau level is directly connected
to its energy and to the strength of the external magnetic field. To see this
we make a semiclassical estimate of the area, $A(n_-)$. Since $v/r = \omega_L$, and
classically $E = \frac{1}{2}mv^2$, we have $r^2 = 2E/m\omega_L^2$. Now the quantum theory
requires (28), so $r^2 = (2\hbar/m\omega_L)\left(n_- + \frac{1}{2}\right)$. This becomes more transparent if
we multiply by $\pi B_0$ to form the flux through the orbit, $\Phi_L(n_-) = \pi r^2B_0$,
and substitute for $\omega_L$,

$$ \Phi_L(n_-) = \frac{hc}{e}\left(n_- + \frac{1}{2}\right). \qquad (35) $$

It appears that the flux through the particle’s orbit comes in units of a fun-
damental quantum unit of flux, $\Phi_0 = hc/e$! In the next section, we shall see
that something very close to this characterizes the full quantum treatment.
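
To get a feeling for the numbers, the following sketch evaluates the semiclassical estimate for an electron in Gaussian (CGS) units, matching the factors of $c$ used in these notes. It computes the orbit radius from $E = \frac{1}{2}mv^2$ with $v = \omega_Lr$ and $E = \hbar\omega_L(n_- + \frac{1}{2})$, and compares the enclosed flux with $\Phi_0 = hc/e$. The constants are standard values; the function names and the example field of $10^4$ gauss are assumptions for illustration.

```python
import math

# CGS (Gaussian) values
H_PLANCK = 6.62607015e-27      # erg s
C_LIGHT = 2.99792458e10        # cm / s
E_CHARGE = 4.80320471e-10      # esu
M_ELECTRON = 9.1093837015e-28  # g
HBAR = H_PLANCK / (2 * math.pi)

PHI_0 = H_PLANCK * C_LIGHT / E_CHARGE  # flux quantum hc/e in gauss cm^2

def larmor_frequency(B0, mass=M_ELECTRON):
    """omega_L = e B0 / (m c), eq. (16), in rad/s for B0 in gauss."""
    return E_CHARGE * B0 / (mass * C_LIGHT)

def orbit_radius_squared(n_minus, B0, mass=M_ELECTRON):
    """Semiclassical r^2 = 2E / (m omega_L^2) with E = hbar omega_L (n_minus + 1/2)."""
    omega_L = larmor_frequency(B0, mass)
    energy = HBAR * omega_L * (n_minus + 0.5)
    return 2 * energy / (mass * omega_L ** 2)

def orbit_flux(n_minus, B0, mass=M_ELECTRON):
    """Flux pi r^2 B0 through the semiclassical orbit."""
    return math.pi * orbit_radius_squared(n_minus, B0, mass) * B0

B0 = 1.0e4  # gauss (1 tesla), illustrative
for n in range(3):
    print(n, math.sqrt(orbit_radius_squared(n, B0)), orbit_flux(n, B0) / PHI_0)
```

The ratio of the orbit flux to $\Phi_0$ comes out as $n_- + \frac{1}{2}$ independent of the mass and of the field, as eq. (35) states.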


3.4.2. A more careful look at translation invariance 


In this section we take a more sophisticated approach to the Landau problem 
that will clarify both the degeneracy of the Landau levels and the way in 
which the system manages to respect homogeneity in the xy-plane. We begin 
with a canonical transformation, trading $x_j$ and $p_j$ for new variables,


(36) 


It is easy to check that the Landau Hamiltonian, (21), can be written in
terms of $\Pi$ and $\phi$ alone,

$$ H = \frac{1}{2m}\Pi^2 + \frac{1}{2}m\omega_L^2\phi^2, \qquad (37) $$

with $[\phi, \Pi] = i\hbar$. Once again, we have a harmonic oscillator, with eigenener-
gies $E_n = \left(n + \frac{1}{2}\right)\hbar\omega_L$.
The degeneracy formerly associated with $n_+$ is now connected with the
$P_1$ and $P_2$. Since $[P_k, \phi] = [P_k, H] = 0$ we see that $P_1$ and $P_2$ are candidate
constants of the motion. To see what symmetry they generate consider the
commutators of $P_k$ with $x_j$, the spatial coordinates. A brief calculation yields

$$ [x_j, P_k] = i\hbar\,\delta_{jk}. \qquad (38) $$

So the new “momenta”, $P_k$, generate translations of the coordinates in the
usual sense. Thus the symmetry associated with the constants of the motion
(and the degeneracy of the Landau levels) is translation invariance. This is
a welcome result. However, we cannot simply diagonalize $P_1$ and $P_2$ along
with $H$ because they do not commute with each other,

$$ [P_1, P_2] = 2i\hbar m\omega. \qquad (39) $$

Since the commutator of $P_1$ and $P_2$ is a c-number, perhaps we can con-
struct functions of them which do commute. The simplest possibility is to
look at finite translations of the form,

$$ T_1(b_1) = e^{-iP_1b_1/\hbar} $$
$$ T_2(b_2) = e^{-iP_2b_2/\hbar}. \qquad (40) $$

According to our study of translations, these operators should translate $x_1$
and $x_2$ by $b_1$ and $b_2$ respectively. Indeed, it is easy to show that on account
of (38),

$$ T_1(b_1)^\dagger\,x_1\,T_1(b_1) = x_1 + b_1 $$
$$ T_1(b_1)^\dagger\,x_2\,T_1(b_1) = x_2 \qquad (41) $$

and so on.

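The crucial question is whether we can choose $b_1$ and $b_2$ such that the
finite translations commute, $[T_1(b_1), T_2(b_2)] = 0$. First, let’s determine the
effect of $T_1(b_1)$ on $P_2$,

$$ f_2(b_1) \equiv T_1(b_1)^\dagger P_2\,T_1(b_1) $$
$$ \frac{d}{db_1}f_2(b_1) = \frac{-i}{\hbar}\,T_1(b_1)^\dagger[P_2, P_1]\,T_1(b_1) = -2m\omega, \quad\text{so} $$
$$ f_2(b_1) = f_2(0) - 2m\omega b_1 = P_2 - 2m\omega b_1. \qquad (42) $$

This enables us to apply the operator $T_1(b_1)$ to $T_2(b_2)$,

$$ T_1(b_1)^\dagger\,T_2(b_2)\,T_1(b_1) = e^{2im\omega b_1b_2/\hbar}\,T_2(b_2), \quad\text{or} $$
$$ T_2(b_2)\,T_1(b_1) = e^{2im\omega b_1b_2/\hbar}\,T_1(b_1)\,T_2(b_2). \qquad (43) $$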


So the finite translations fail to commute only by virtue of this multiplicative
factor of unit magnitude. If (and only if) we choose the parameters $b_1$ and
$b_2$ so that the phase is a multiple of $2\pi$ then the translations commute. This
condition is

$$ 2m\omega b_1b_2/\hbar = 2\pi N \qquad (44) $$

for an integer $N$. This defines a rectangle of area $b_1b_2$ in the xy-plane. Let
us find the flux through this rectangle,

$$ \Phi(b_1, b_2) = b_1b_2B_0 = \frac{\pi\hbar N}{m\omega}B_0 = N\Phi_0. \qquad (45) $$


So we have established a maximal set of commuting operators for the
Landau problem: $H$, $T_1(b_1)$, and $T_2(b_2)$, where $b_1$ and $b_2$ obey (45), completely
characterize the states of a charged particle in a constant magnetic field.
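
The commutation condition is simple enough to check with numbers. The sketch below works with the single parameter $m\omega/\hbar = 1/\ell_0^2$ (an assumption of the example, as are the function names), picks $b_2$ so that $2m\omega b_1b_2/\hbar = 2\pi N$, and evaluates the phase factor by which the two translations fail to commute. The sign of the exponent depends on conventions, but the condition for it to equal one does not.

```python
import cmath
import math

def translation_phase(b1, b2, m_omega_over_hbar):
    """Phase factor exp(2i m omega b1 b2 / hbar) relating T2 T1 and T1 T2."""
    return cmath.exp(2j * m_omega_over_hbar * b1 * b2)

def commuting_b2(b1, N, m_omega_over_hbar):
    """Side b2 for which 2 m omega b1 b2 / hbar = 2 pi N."""
    return math.pi * N / (m_omega_over_hbar * b1)

k = 0.8   # m*omega/hbar = 1/l0^2, illustrative value
b1 = 1.7  # illustrative side length
for N in (1, 2, 3):
    b2 = commuting_b2(b1, N, k)
    print(N, b1 * b2 * k / math.pi, abs(translation_phase(b1, b2, k) - 1))

# A generic rectangle does not satisfy the condition:
print(abs(translation_phase(b1, 1.0, k) - 1))
```

The cell area $b_1b_2$ is fixed at $N\pi\ell_0^2$, but the aspect ratio $b_1/b_2$ is free, which is why any pair of distances satisfying the condition can be used.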

Let’s take some time to interpret this result... 


Translations in the $n_+, n_-$ basis. Now that we know that $T_1(b_1)$ and
$T_2(b_2)$ commute with $H$ when $\vec b = (b_1, b_2)$ satisfies (45), we can use them to
translate the states we found in §3.3 around the xy-plane. The motivation
for this is to understand the degeneracy of the Landau levels associated with
the $n_+$ quantum number. Since the state with $n_+ = n_- = 0$ is localized
at the origin, we anticipate that the translation by $\vec b$ will produce a state
localized around $\vec b$.

Consider, then, the state with $n_+ = n_- = 0$, a harmonic oscillator
ground state centered at $\vec x = 0$. Now translate this state to $\vec b$, where $(b_1, b_2)$
satisfies the condition (45) with $N = 1$, so they represent the smallest trans-
lation that commutes with the Hamiltonian,

$$ |\vec b, 0, 0\rangle = T_1(b_1)\,T_2(b_2)\,|0, 0\rangle. \qquad (46) $$

This state is normalized to unity because the operators $T_j$ are unitary. It has
energy $E = \frac{1}{2}\hbar\omega_L$ and $\langle\vec x\rangle = \vec b$. What does it look like in terms of the original
basis $|n_+, n_-\rangle$? To answer this we must express the translations in terms of
the $\{a_\pm\}$. Using the Baker-Hausdorff Theorem, a short calculation gives:


$$ |\vec b, 0, 0\rangle = \exp\left(-\frac{1}{2}\frac{|b|^2}{\ell_0^2}\right)\exp\left(\frac{b}{\ell_0}a_+^\dagger\right)|0, 0\rangle, \qquad (47) $$

where $b = b_1 - ib_2$ and $\ell_0 = \sqrt{\hbar/m\omega}$ is the natural scale of lengths associated
with the Landau problem. This is an example of a “coherent state”, as we
discussed in §2, superposing an infinite series of degenerate states with differ-
ing values of $n_+$. Note that $n_- = 0$ is preserved so, as promised, translation
did not change the energy of the state. The prefactor is of particular interest
because it determines the overlap of the translated state with the original
state, $|0, 0\rangle$,

$$ |\langle 0, 0|\vec b, 0, 0\rangle|^2 = \exp\left(-2\pi\,\frac{b_1^2 + b_2^2}{2b_1b_2}\right). \qquad (48) $$

The exponential is bounded by $e^{-2\pi}$. Thus the overlap of the original state
and its translation to the next “cell” is very small. The same analysis applies
to any of the eigenstates $|0, n_-\rangle$. We can translate each of these energy eigen-
states to any point on a rectangular lattice $(n_1b_1, n_2b_2)$ throughout the plane.
Although these states are not orthonormal, they give us a qualitatively
correct picture of the solutions of the Landau problem as towers of nearly
localized energy eigenstates with $E = \left(n + \frac{1}{2}\right)\hbar\omega_L$ situated in unit cells on a
grid labeled by any pair of distances $b_1$ and $b_2$ satisfying (45).
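
A short numerical check of the bound, as a sketch with a hypothetical function name: for a unit cell with $N = 1$ the overlap depends only on the aspect ratio $b_1/b_2$, is largest for a square cell, and never exceeds $e^{-2\pi}$.

```python
import math

def overlap_squared(b1, b2):
    """|<0,0|b,0,0>|^2 from eq. (48) for an N = 1 cell with sides b1 and b2."""
    return math.exp(-2 * math.pi * (b1 ** 2 + b2 ** 2) / (2 * b1 * b2))

bound = math.exp(-2 * math.pi)  # roughly 0.0019
for ratio in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(ratio, overlap_squared(ratio, 1.0), overlap_squared(ratio, 1.0) <= bound)
```

The bound follows from $b_1^2 + b_2^2 \ge 2b_1b_2$, with equality only for $b_1 = b_2$.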


Eigenstates of the finite translations. Another, more conventional, ap-
proach is to study eigenstates of our maximal set of commuting operators,
$H$, $T_1(b_1)$, and $T_2(b_2)$. First we choose some $\vec b$ satisfying (45). Since $T_j$ is
a unitary operator its eigenvalues are complex numbers of unit magnitude,
which we parameterize by

$$ T_1(b_1)|\phi_1, \phi_2, n\rangle = e^{i\phi_1}|\phi_1, \phi_2, n\rangle $$
$$ T_2(b_2)|\phi_1, \phi_2, n\rangle = e^{i\phi_2}|\phi_1, \phi_2, n\rangle $$
$$ H|\phi_1, \phi_2, n\rangle = \left(n + \frac{1}{2}\right)\hbar\omega_L|\phi_1, \phi_2, n\rangle. \qquad (49) $$

It is easy to see that the phase $\phi_j$ must be linear in $b_j$ ($T_1(2b_1) = [T_1(b_1)]^2$),
so we define $\phi_j = k_jb_j$, where $k_j$ is a real number in the range $-\pi/b_j < k_j < \pi/b_j$
because the phase $\phi$ is only defined modulo $2\pi$. This makes these states look
very similar to plane waves even though they are not. If we construct the
coordinate space wavefunction corresponding to $|\phi_1, \phi_2, n\rangle$,

$$ \psi_{k_1k_2n}(x_1, x_2) = \langle x_1, x_2|\phi_1, \phi_2, n\rangle \qquad (50) $$

then it is easy to see that the consequence of (49) is that

$$ \psi_{k_1k_2n}(x_1 + b_1, x_2 + b_2) = e^{i(k_1b_1 + k_2b_2)}\,\psi_{k_1k_2n}(x_1, x_2) \qquad (51) $$

just like a plane wave, $\exp i(k_1x_1 + k_2x_2)$, would behave. The difference, of
course, is that $\psi$ has this simple behavior only for the special translations we
have discovered, not for an arbitrary translation, and as a consequence, the
“momenta” $(k_1, k_2)$ are not conserved.

States like these arise in many other situations where a system is invariant 
only under certain finite translations. The classic example is a crystal lattice 
which is invariant if we translate by the vectors that define a unit cell but 
not otherwise. In this way the Landau system resembles a two-dimensional 
crystal with a rectangular unit cell whose area is determined by the flux 
quantization condition (45). Wavefunctions that behave like (51) are known 
as Bloch waves in honor of Felix Bloch who first studied quantum mechanics 
in periodic structures. We will discuss them in detail when addressing the 
quantum theory of electrons in metals. 

Note that the states $\psi_{k_1k_2n}(x_1, x_2)$ extend everywhere throughout the xy-
plane. This is clear from (51) since their amplitude arbitrary distances away
from some original location is only modulated by a phase. Thus this basis is
very different from the quasi-localized basis we obtained by translating the
state $|n_+ = 0, n_- = 0\rangle$ around the plane in the previous section. Of course
they describe the same system and the same physics and are related through 
the marvelous power of the superposition principle. 


4 The Aharonov Bohm Effect 


The vector potential, $\vec A(\vec x)$, makes a surprising appearance in the quantum
description of a particle in a magnetic field. It all stems from the classical
Hamiltonian,

$$ H = \frac{1}{2m}\left(\vec p - \frac{e}{c}\vec A\right)^2. \qquad (52) $$

Despite the appearance of $\vec A$ in $H$, we know that at the classical level, the
dynamics depends only on $\vec E$ and $\vec B$ because only they appear in Newton’s
Laws, eq. (15). In the classical domain $\vec A$ and the electrostatic potential, $\Phi$,
can be regarded as merely useful, but inessential, abstractions.
In the quantum theory $H$ rather than $m\ddot{\vec x}$ is fundamental, so the possibil-
ity exists that physics depends on $\vec A$. For the case we have studied in detail
(motion in a constant magnetic field) $\vec A = -\frac{1}{2}\vec x\times\vec B$, so we cannot even
define dependence on $\vec A$ independent of $\vec B$.


In 1959 Y. Aharonov and D. Bohm proposed a way to observe a direct
effect of $\vec A$ and established the quantum significance of $\vec A$.$^6$ Although this may
seem like a somewhat technical detail, it captures some of the unusual aspects 
of “reality” in the quantum world and has fascinated students ever since. 
Perhaps more important, vector potentials associated with generalizations of 
electromagnetism play a central role in our extraordinarily successful theories 
of subnuclear particle physics. In that arena observable consequences of the 
vector potentials abound. 


Figure 1: Paths for analysis of Aharonov-Bohm effect. 


$^6$Y. Aharonov and D. Bohm, Phys. Rev. 115 (1959) 485.

Aharonov and Bohm proposed to consider motion of a charged particle in
the plane perpendicular to an idealized solenoid that produces a constant
magnetic field, $\vec B = B_0\hat e_3$, but only within a circle of radius $R$. For $r > R$,
$\vec B$ can be taken to vanish identically. This configuration is shown in Fig. 1
along with a couple of paths that will figure in the discussion. Even though
$\vec B = 0$ for $r > R$, $\vec A$ cannot vanish in this region because of Stokes’ theorem.
Consider the integral of the vector potential around the circle marked $C$
(which is the boundary of a disk $S$) of radius $r > R$ shown in the figure. Then
Stokes’ theorem and the defining relation for $\vec A$, $\vec B = \vec\nabla\times\vec A$, give

$$ \oint_C d\vec l\cdot\vec A = \iint_S d\vec S\cdot\vec\nabla\times\vec A = \iint_S d\vec S\cdot\vec B = \pi R^2B_0, \qquad (53) $$


so $\vec A$ cannot vanish everywhere on the circle $C$. In fact symmetry requires
that $\vec A$ point in the azimuthal, $\hat\phi$, direction, so an elementary calculation
gives,

$$ \vec A = \frac{B_0R^2}{2r}\,\hat\phi \quad\text{for } r > R. \qquad (54) $$

Of course the existence of a vector potential in the region outside $R$, where
$\vec B = 0$, is a classical phenomenon no more surprising than the existence of
an electrostatic potential inside a uniformly charged sphere where $\vec E = 0$.
The question of interest is whether some physical phenomenon that takes
place entirely in the region $r > R$, where $\vec B = 0$, can depend on $\vec A$. To study
this, consider a charged particle described by a wave packet that moves on
either of the two paths marked $C_1$ and $C_2$ in the figure. Of course a quantum
particle cannot be constrained to a definite path, but we will see that the
effect is the same along all paths close to $C_1$ or $C_2$, so a diffuse path of the
sort allowed in quantum mechanics will give the same result.

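The particle’s propagation through the vector potential is determined by
the time dependent Schrödinger equation,

$$ \frac{1}{2m}\left(\vec p - \frac{e}{c}\vec A\right)^2\psi(\vec r, t) = i\hbar\frac{\partial}{\partial t}\psi(\vec r, t), \qquad (55) $$

where $\vec p = -i\hbar\vec\nabla$. Define the Aharonov-Bohm phase factor,

$$ g(\vec r, C) = \frac{e}{\hbar c}\int_C d\vec l\cdot\vec A \qquad (56) $$

where the line integral begins at the point $P_1$, follows curve $C$ and ends at a
point $\vec r$. Note that

$$ \vec\nabla g(\vec r, C) = \frac{e}{\hbar c}\vec A(\vec r) \qquad (57) $$

independent of the curve $C$. Now factor the phase $g$ out of the wavefunction,

$$ \psi(\vec r, t) = \exp(ig)\,\chi(\vec r, t), \qquad (58) $$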


and substitute into eq. (55). The result is that $\chi$ obeys the free Schrödinger
equation,

$$ -\frac{\hbar^2}{2m}\vec\nabla^2\chi = i\hbar\frac{\partial\chi}{\partial t}. \qquad (59) $$

Thus all information about the vector potential is contained in the phase
that multiplies $\chi$.

With the form of eq. (58) in mind let us compare the phase accumulated
by a well-localized charged particle wave packet that begins at point $P_1$
and propagates to $P_2$ along either path $C_1$ or $C_2$. We start with $\psi(\vec r, t_1)$
concentrated at $P_1$ at $t = t_1$. Let us suppose, however, that $\psi$ is a quantum
mechanical superposition of two terms, $\psi(\vec r, t_1) = \psi_1(\vec r, t_1) + \psi_2(\vec r, t_1)$ such
that $\psi_1(\vec r, t_1)$ and $\psi_2(\vec r, t_1)$ describe wave packets which subsequently (i.e.
after $t_1$) follow the two different paths $C_1$ and $C_2$. At time $t_2$, the two wave
packets both reach $P_2$, and

$$ \psi(\vec r, t_2) = \exp\left[\frac{ie}{\hbar c}\int_{C_1}d\vec l\cdot\vec A\right]\chi_1(\vec r, t_2) + \exp\left[\frac{ie}{\hbar c}\int_{C_2}d\vec l\cdot\vec A\right]\chi_2(\vec r, t_2) \qquad (60) $$

with both $\chi_1(\vec r, t_2)$ and $\chi_2(\vec r, t_2)$ localized in the vicinity of the point $P_2$.
Whatever other ($\vec A$-independent) relative phase may have accumulated by
the time the wave packets have reached $P_2$, there is an $\vec A$-dependent relative
phase,

$$ \psi(\vec r, t_2) = \exp\left[\frac{ie}{\hbar c}\int_{C_1}d\vec l\cdot\vec A\right]\left\{\chi_1(\vec r, t_2) + \exp\left[\frac{ie}{\hbar c}\oint_C d\vec l\cdot\vec A\right]\chi_2(\vec r, t_2)\right\}. \qquad (61) $$

Note that the relative phase is given by the loop integral over the closed path
$C = C_2 - C_1$.

The relative phase in eq. (61) is measurable, for example by watching
the interference pattern on a detector at $P_2$ as the magnetic field, $\vec B$, is slowly
changed. The phase depends only on the loop integral of $\vec A$, which in turn
depends only on the total magnetic flux enclosed within the path $C$,

$$ g(C) = \frac{e}{\hbar c}\oint_C d\vec l\cdot\vec A = e\pi R^2B_0/\hbar c = e\Phi/\hbar c \qquad (62) $$

where $\Phi = \pi R^2B_0$ is the magnetic flux contained within $C$.
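
The path independence of the loop integral can be illustrated numerically. The sketch below uses the exterior vector potential $\vec A = (B_0R^2/2r)\hat\phi$ written in Cartesian components and integrates it around circles by the midpoint rule; the values of $B_0$ and $R$, the choice of circular loops and the function names are assumptions for the example. Loops that encircle the solenoid give $\pi R^2B_0$ regardless of their size or centre, and a loop that does not encircle it gives zero.

```python
import math

B0, R = 2.0, 1.0  # illustrative solenoid field and radius

def A_outside(x, y):
    """Vector potential (B0 R^2 / 2r) phi_hat for r > R, in Cartesian components."""
    r2 = x * x + y * y
    return (-0.5 * B0 * R * R * y / r2, 0.5 * B0 * R * R * x / r2)

def loop_integral(A, center, radius, steps=20000):
    """Midpoint-rule line integral of A around a counterclockwise circle."""
    cx, cy = center
    dtheta = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        Ax, Ay = A(x, y)
        total += Ax * dx + Ay * dy
    return total

def ab_phase(flux, e, hbar, c):
    """Aharonov-Bohm phase g = e * flux / (hbar * c), eq. (62)."""
    return e * flux / (hbar * c)

flux = math.pi * R * R * B0
print(loop_integral(A_outside, (0.0, 0.0), 2.0) / flux)   # loop centred on the solenoid
print(loop_integral(A_outside, (0.5, 0.0), 3.0) / flux)   # off-centre loop around it
print(loop_integral(A_outside, (5.0, 0.0), 2.0) / flux)   # loop that misses the solenoid
```

Only the enclosed flux matters, which is the numerical counterpart of the statement that the phase is a global property of the motion.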


We have been fairly careful in this discussion to make it clear that the 
particle moves entirely in a region where B = 0, so the only source of the 
phase is the vector potential A. The result does not depend on the details 
of the path followed by the particle. For example, if we replace the path
$C_2$ by a nearby path $C_2'$, on the same side of the solenoid, then the resulting
phase, $g(C')$, where $C' = C_2' - C_1$, is unchanged because $g(C')$ depends only
on the magnetic flux enclosed by $C'$, which is the same as that enclosed by
$C$. Thus it does not matter that the quantum particle cannot follow a sharp
trajectory. The Aharonov-Bohm phase is a global property of the motion, not
a property of the particle’s exact path. A similar argument shows that $g(C)$
does not depend on the gauge we choose to describe the vector potential. If
we change gauge, from $\vec A$ to $\vec A'$, with

$$ \vec A' = \vec A - \vec\nabla\Lambda, $$

then

$$ g(C) \to g'(C) = g(C) - \frac{e}{\hbar c}\oint_C d\vec l\cdot\vec\nabla\Lambda = g(C) \qquad (63) $$

because the integral of the gradient of any continuous function around a
closed path is zero.

So Aharonov and Bohm have shown, in this simple example, that the vector
potential has physical manifestations in quantum mechanics. Although
the phase $g(C)$ depends on the magnetic flux enclosed by the path $C$, the
path itself lies entirely in a region of space in which $\vec B = 0$ and $\vec A \neq 0$.

To quote Griffiths, page 349, “What are we to make of the Aharonov-Bohm
effect? Evidently our classical preconceptions are simply mistaken.
There can be electromagnetic effects in regions where the fields [B and E] 
are zero. Note, however, that this does not make A itself measurable — 
only the enclosed flux comes into the final answer, and the theory remains
gauge invariant.” You should read Griffiths, section 10.2.4, but should for 
now ignore the connection to Berry’s phase. 

The Aharonov-Bohm effect does not only manifest itself as shifts in inter- 
ference patterns. See pages 344-345 in Griffiths for a description of how the 
Aharonov-Bohm effect leads to shifts in the energy levels for a “bead on a 
loop of string” if the string is everywhere in a region with B = 0 but encircles 
a flux-carrying solenoid.




5 Integer Quantum Hall Effect 


The Hall Effect is an elementary electromagnetic phenomenon where a con- 
ducting strip carrying a current along its length develops a current across its 
width when placed in a magnetic field. The direction of the induced current 
is sensitive to the sign of the electric charge of mobile species in the material 
and can be used to show that conventional currents are carried by electrons 
(negative charge) and that certain semiconductors contain positive mobile 
charges. In 1980, von Klitzing discovered that the relation between the ex- 
ternal electric potential and the Hall current is quantized in strong magnetic 
fields: the conductance, $I_{\text{Hall}}/V$, comes in units of $e^2/h$. Von Klitzing was
awarded the 1985 Nobel Prize for his discovery of the Quantum Hall Effect 
(QHE). There has been a tremendous amount of work on this subject over 
the past 20 years. New effects — including the fractional QHE — have been 
discovered, and a decent treatment of the subject would fill a course. In 
Quantum Physics III I would like to explain the origins of the effect — as 
an extension of the Landau problem — under ideal circumstances. First I 
will review the ordinary Hall Effect (though I will not assume you have seen 
it before). Next I will solve an idealized quantum problem: the Schrödinger
equation for electrons propagating in the xy-plane with a magnetic field nor- 
mal to the plane and an electric field in the y direction. This will lead 
to quantization of the Hall conductance provided we make some simplify- 
ing assumptions about the structure of the material in which the electrons 
propagate. As usual in condensed matter physics, after solving an idealized 
problem I will have to return to the real world of actual materials and ex- 
plain why the results of the idealized analysis survive unscathed. Much of 
my presentation relies heavily on the introductory sections of the review The 
Quantum Hall Effect, by R. E. Prange and S. M. Girvin (Springer-Verlag, 
Berlin, 1987). 


5.1 The ordinary Hall effect and the relevant variables 


First let us review the ordinary Hall effect. A strip of conductor lies in the 
xy-plane. A constant and uniform electric field, $\vec E$, points in the y-direction.


A constant and uniform magnetic field, B, is oriented normal to the xy-plane. 
First consider the case where $\vec B = 0$. Mobile charge carriers* with charge


*Electrons, for our case, but to keep track of signs we consider the charge carriers




q accelerate in response to $\vec E$, but suffer random, redirecting collisions with
ions. An elementary argument (presented in 8.02?) leads to the conclusion
that the electrons develop a drift velocity $\vec v = q\vec E\tau_0/m$, where $\tau_0$ is the
average time between collisions. These drifting charges generate a current
density (charge per unit time per unit length in the xy-plane), $\vec j = qn\vec v$,
where n is the density of charge carriers (per unit area). The result is a
current density linearly proportional to the impressed electric field,

\[
\vec j = \frac{n q^2 \tau_0}{m}\,\vec E. \tag{64}
\]

The constant of proportionality relating $\vec j$ and $\vec E$ is the conductivity, and has
units (in two dimensions) $[j]/[E] = \ell/t = [\text{velocity}]$. This is simply Ohm’s
law with conductivity $\sigma_0 = nq^2\tau_0/m$. It is useful to define the resistivity by
the relation $\vec E = \rho\,\vec j$ (which is the local analog of $V = IR$), in which case
$\rho_0 = 1/\sigma_0$. It is also useful to think of the resistivity (and conductivity) as a
matrix relating the vector $\vec j$ to the vector $\vec E$. In this simple case, the matrix
is diagonal,

\[
\rho = \begin{pmatrix} \rho_0 & 0 \\ 0 & \rho_0 \end{pmatrix}. \tag{65}
\]

When the magnetic field is turned on, the mobile charges respond to
an “effective” electric field arising from the combined electric and magnetic
fields according to the Lorentz force law,

\[
\vec F = q\vec E_{\text{eff}} = q\vec E + q\,\frac{\vec v}{c}\times\vec B. \tag{66}
\]


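The current comes from the charge carriers’ drift in response to $\vec E_{\text{eff}}$ and is
therefore given by $\vec j = \sigma_0\vec E_{\text{eff}}$. Also, $\vec v = \vec j/nq$, so (66) can be rewritten as

\[
\vec j = \sigma_0\vec E + \frac{\sigma_0}{nqc}\,\vec j\times\vec B. \tag{67}
\]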
The current is no longer only parallel to $\vec E$: because of the second term in
(67) it develops a component perpendicular to $\vec E$. This is most conveniently
summarized in terms of a resistivity matrix, $\vec E = \rho\,\vec j$, which is no longer
diagonal. From eq. (67) we have

\[
\vec E = \rho_0\,\vec j - \frac{1}{nqc}\,\vec j\times\vec B. \tag{68}
\]


arbitrary with charge q. 




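We take the magnetic field $\vec B = -B\,\hat e_3$ and obtain,

\[
\rho = \begin{pmatrix} \rho_0 & \dfrac{B}{nqc} \\[2mm] -\dfrac{B}{nqc} & \rho_0 \end{pmatrix}. \tag{69}
\]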


Note that an electric field in the y-direction gives rise to a current density
in the x-direction (and vice versa). This Hall current is easy to observe and
depends on the sign of the charge carriers, because the off-diagonal elements
of the matrix $\rho$ depend on the sign of q. In contrast, the normal resistivity
depends only on $q^2$. This is the stuff of undergraduate physics labs. This Hall
resistivity describes the behavior of realistic conductors over a wide range of
conditions. Surprisingly the behavior of a system of electrons described by
Schrödinger’s equation subject to the same external fields is very different.
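To make the sign dependence concrete, here is a small sketch that builds the resistivity matrix implied by eq. (68) with $\vec B = -B\,\hat e_3$, inverts it, and solves for the current produced by a field along y. The off-diagonal signs are worked out from eq. (68) under that assumption about the field direction; the numerical values and function names are hypothetical, and units are chosen so that $c = 1$.

```python
def resistivity_matrix(rho0, b_field, density, charge, c=1.0):
    """2x2 matrix rho with E = rho j, from E = rho0 j - (1/nqc) j x B, B = -B e_3."""
    hall = b_field / (density * charge * c)
    return [[rho0, hall], [-hall, rho0]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c_, d) = m
    det = a * d - b * c_
    return [[d / det, -b / det], [-c_ / det, a / det]]


def current_density(rho, e_field):
    """Solve E = rho j for j = sigma E, with sigma the inverse of rho."""
    sigma = invert_2x2(rho)
    jx = sigma[0][0] * e_field[0] + sigma[0][1] * e_field[1]
    jy = sigma[1][0] * e_field[0] + sigma[1][1] * e_field[1]
    return (jx, jy)


if __name__ == "__main__":
    e_field = (0.0, -1.0)  # field in the negative y-direction
    for q in (-1.0, +1.0):
        rho = resistivity_matrix(rho0=0.1, b_field=2.0, density=1.0, charge=q)
        print(q, current_density(rho, e_field))
```

Flipping the sign of the carrier charge reverses the x-component of the current while leaving the y-component unchanged, which is the basis of the classic Hall measurement of carrier sign.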


5.2 Electrons in crossed electric and magnetic fields 


5.2.1 Setting up the problem 


Figure 2: Idealized Hall effect system. (Sketch of the strip in the xy-plane, with the edge x = W and the fields E and B indicated.)


In this section we ignore all the complexities of a physical conductor — 
electron ion interactions, thermal effects, impurities, and so forth — and 




consider the idealized problem of a gas of electrons moving in the xy-plane
subject to an electric field, $\vec E = -E_0\,\hat e_2$, in the negative y-direction, and a
magnetic field, $\vec B = -B_0\,\hat e_3$, in the negative z-direction. We assume that
the conductor forms a strip with $0 < x < W$ and study a section between
$y = -L/2$ and $y = L/2$. (See Figure 2.) We assume that the magnetic
field is sufficiently strong that only one of the two electron spin states is of
interest. The other is promoted to higher energy by the dipole interaction
energy, $\mu B_0$.

To construct the Hamiltonian we need an electrostatic potential to produce
$\vec E$ ($\Phi = E_0 y$ will do) and a vector potential to produce $\vec B$, for which we take
$\vec A = B_0 y\,\hat e_1$. Note that we have chosen a different vector potential than we
used in our solution to the Landau problem. This choice is more convenient
here, but the physics cannot depend on it. The Hamiltonian is given by,

\[
H = \frac{1}{2m}\left[\left(-i\hbar\frac{\partial}{\partial x} - \frac{eB_0}{c}\,y\right)^2 - \hbar^2\frac{\partial^2}{\partial y^2}\right] + eE_0 y. \tag{70}
\]


As usual, it is convenient to introduce some scaled variables. We define the
Larmor frequency $\omega_L = eB_0/mc$ as usual, and introduce the natural length
scale of the Landau problem, $\ell_0 = \sqrt{\hbar c/eB_0}$. If we now scale the dimensional
factors out of (70) we obtain,

\[
H = \frac{\hbar\omega_L}{2}\left[\left(-i\frac{\partial}{\partial\xi} - \eta\right)^2 - \frac{\partial^2}{\partial\eta^2} + 2\alpha\eta\right], \tag{71}
\]

where

\[
\xi = x/\ell_0, \qquad \eta = y/\ell_0, \qquad \alpha = eE_0\ell_0/\hbar\omega_L. \tag{72}
\]

$\xi$ and $\eta$ are scaled coordinates and $\alpha$ measures the energy scale of electric
relative to magnetic effects.
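We now take $(\xi, \eta, p_\xi, p_\eta)$ to be our canonical variables, so

\[
H = \frac{\hbar\omega_L}{2}\left[(p_\xi - \eta)^2 + p_\eta^2 + 2\alpha\eta\right], \tag{73}
\]

where $p_\xi = -i\,\partial/\partial\xi$ and $p_\eta = -i\,\partial/\partial\eta$.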




5.2.2 Eigenstates and eigenenergies 


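It is quite straightforward to find the eigenenergies and eigenstates of this
Hamiltonian. First note that $p_\xi$ is a constant of the motion, $[p_\xi, H] = 0$.
We denote the eigenvalue of $p_\xi$ by k. An eigenstate of $p_\xi$ can be written in
coordinate space as

\[
\psi(\xi,\eta) = e^{ik\xi}\,\phi(\eta,k)
\]

with

\[
\begin{aligned}
H\phi(\eta,k) &= \frac{\hbar\omega_L}{2}\left[p_\eta^2 + 2\alpha\eta + (k-\eta)^2\right]\phi(\eta,k) \\
&= \frac{\hbar\omega_L}{2}\left[p_\eta^2 + (\eta - k + \alpha)^2 + 2\alpha k - \alpha^2\right]\phi(\eta,k).
\end{aligned} \tag{74}
\]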


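This is simply a one-dimensional harmonic oscillator centered at $\eta = k - \alpha$
with eigenenergies shifted by $\hbar\omega_L(\alpha k - \alpha^2/2)$,

\[
E_n(k) = \hbar\omega_L\left(n + \frac12\right) + \hbar\omega_L\left(\alpha k - \frac{\alpha^2}{2}\right). \tag{75}
\]

The associated wavefunction, $\phi_n(\eta,k)$, is a gaussian (multiplying a Hermite
polynomial) strongly localized around $\eta = k - \alpha$,

\[
\phi_n(\eta,k) = \exp\left[-\frac12(\eta - k + \alpha)^2\right]H_n(\eta - k + \alpha). \tag{76}
\]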


The continuous variable k labels the degeneracy of the Landau levels if $E = 0$.
When $E \neq 0$, the highly degenerate Landau level spreads out into a band
with energies that depend on $E$ through $\alpha$ as displayed in eq. (75).
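The completing-the-square step in eq. (74) and the resulting band energies of eq. (75) are easy to check numerically. The sketch below works in units of $\hbar\omega_L$; the function names are illustrative only.

```python
def potential_original(eta, k, alpha):
    """eta-dependent terms in the first line of eq. (74): 2 alpha eta + (k - eta)^2."""
    return 2 * alpha * eta + (k - eta) ** 2


def potential_completed(eta, k, alpha):
    """The same terms after completing the square (second line of eq. (74))."""
    return (eta - k + alpha) ** 2 + 2 * alpha * k - alpha ** 2


def band_energy(n, k, alpha):
    """E_n(k) of eq. (75), in units of hbar * omega_L."""
    return (n + 0.5) + (alpha * k - 0.5 * alpha ** 2)


if __name__ == "__main__":
    for eta, k, alpha in [(0.3, 1.2, 0.7), (-2.0, 0.5, 0.1), (4.0, -3.0, 1.5)]:
        print(potential_original(eta, k, alpha), potential_completed(eta, k, alpha))
    # With alpha = 0 every k gives the same energy: the degenerate Landau level.
    print(band_energy(0, -5.0, 0.0), band_energy(0, 5.0, 0.0))
    # With alpha != 0 the energy grows linearly in k with slope alpha.
    print(band_energy(0, 5.0, 0.2) - band_energy(0, -5.0, 0.2))
```

The two forms of the potential agree for any input, and the energy depends on k only when $\alpha \neq 0$, which is the spreading of the level into a band described above.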


5.2.3. Degeneracies 


When $E = 0$ this problem reduces to motion in a constant magnetic field, $B_0$,
a problem we have just solved, albeit with a different choice for the vector
potential $\vec A$. We can use our earlier solution to help us learn how to count
and label the states in the case $E \neq 0$.

First consider $E = 0$. When we studied Landau levels we learned to
expect one state in each energy level in an area containing one quantum of
flux, $\Phi_0 = hc/e$. So we expect each Landau level to be N-fold degenerate,
where $N = \Phi/\Phi_0$, and $\Phi$ is the flux, $\Phi = LWB_0$, passing through the region
$0 < x < W$ and $-L/2 < y < L/2$. We can relate this degeneracy to the
allowed values of k, which labels the degeneracy in eq. (75). From (76) with




$\alpha = 0$ we see that the state with “momentum” k is localized at $\eta = k$ or
$y = k\ell_0$. Thus the states in the region $-L/2 < y < L/2$ correspond to a definite
range of k,

\[
-\frac{L}{2\ell_0} \le k \le \frac{L}{2\ell_0}. \tag{77}
\]

We can associate an interval in k with each state by noting that N states
must fit into the k-range given by (77):

\[
\Delta k = \frac{L}{\ell_0 N}. \tag{78}
\]

If we substitute $N = \Phi/\Phi_0$, we obtain,

\[
\Delta k = \frac{2\pi\ell_0}{W}. \tag{79}
\]

The allowed values of k can therefore be labelled by an integer, p,

\[
k_p = \frac{2\pi\ell_0}{W}\,p + \zeta \tag{80}
\]

for $-LW/4\pi\ell_0^2 \le p \le LW/4\pi\ell_0^2$. The constant $\zeta$ cannot be determined by
what we have done so far.

To summarize: For $E = 0$ we have found that the known degeneracy
of the Landau levels quantizes the allowed values of the “momentum”, k,
according to the rule (80). Referring back to the wavefunction (74), we find
that (80) amounts to a periodicity requirement, $\psi(x = 0, y) = \psi(x = W, y)$,
up to the phase $\zeta W$. The phase $\zeta W$ plays no role in the physics, so we set
$\zeta$ to zero henceforth.

Now we return to the case of interest, namely $\alpha \neq 0$. We assume that the
wavefunction remains periodic in x, $\psi(x = 0, y) = \psi(x = W, y)$, and use the
y-dependence to find the range of k. For $\alpha \neq 0$, the eigenstates are centered
at $\eta = k - \alpha$. This translates into the quantization rule $k = (2\pi\ell_0/W)\,p$ with

\[
-\frac{LW}{4\pi\ell_0^2} + \frac{\alpha W}{2\pi\ell_0} \le p \le \frac{LW}{4\pi\ell_0^2} + \frac{\alpha W}{2\pi\ell_0}. \tag{81}
\]


The only effect of the electric field is to shift the allowed values of the “mo- 
mentum” k. 
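The state counting can be spelled out in a few lines. The sketch below computes the number of flux quanta through the sample and lists the integers p allowed by the quantization rule, with and without the shift caused by the electric field. The sample dimensions are hypothetical and lengths are measured in units of $\ell_0$ by default.

```python
import math


def flux_quanta(length, width, l0=1.0):
    """N = Phi / Phi_0 = L W / (2 pi l0^2), using l0^2 = hbar c / (e B_0)."""
    return length * width / (2 * math.pi * l0 ** 2)


def k_spacing(width, l0=1.0):
    """Spacing of allowed k values, Delta k = 2 pi l0 / W."""
    return 2 * math.pi * l0 / width


def allowed_p(length, width, alpha=0.0, l0=1.0):
    """Integers p for which the orbit centre eta = k - alpha lies inside |y| <= L/2,
    where k = 2 pi l0 p / W."""
    half = length * width / (4 * math.pi * l0 ** 2)
    shift = alpha * width / (2 * math.pi * l0)
    return range(math.ceil(-half + shift), math.floor(half + shift) + 1)


if __name__ == "__main__":
    L, W = 200.0, 300.0  # hypothetical sample size in units of l0
    print(flux_quanta(L, W))
    print(len(allowed_p(L, W)))
    shifted = allowed_p(L, W, alpha=0.5)
    print(len(shifted), shifted[0], shifted[-1])
```

The number of allowed p values matches the number of flux quanta to within one state, and the electric field changes only where the window of allowed values sits, not how many states it contains.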

Let us summarize our solution to the problems of electrons propagating 
in constant crossed electric and magnetic fields — 




The states are labeled by quantum numbers k and n,

\[
\psi(\xi,\eta) = \sqrt{\frac{\ell_0}{W}}\;e^{ik\xi}\exp\left[-\frac12(\eta - k + \alpha)^2\right]H_n(\eta - k + \alpha), \tag{82}
\]

where we have introduced a factor $\sqrt{\ell_0/W}$ so that the wavefunction
is normalized to unity over the rectangle of area LW. (We assume
$\exp(-\eta^2/2)H_n(\eta)$ is normalized in $\eta$.)


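The eigenenergies are

\[
E_n(k) = \hbar\omega_L\left(n + \frac12\right) + \hbar\omega_L\left(\alpha k - \frac{\alpha^2}{2}\right). \tag{83}
\]

The quantum numbers n and k range over the values,

\[
n = 0, 1, 2, \ldots, \qquad -\frac{L}{2\ell_0} + \alpha \le k \le \frac{L}{2\ell_0} + \alpha. \tag{84}
\]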


The degeneracy of the Landau levels is broken by the electric field. Each 
Landau level breaks up into a “band” of very closely spaced levels. The 
width of the band is $\Delta E = \hbar\omega_L\alpha L/\ell_0 = eE_0L$, which is just the change
in the classical electrostatic energy of the electron over the length of 
the conductor. As long as the magnetic field is strong we can assume 
that the now-smeared-out Landau levels remain well separated from 
one another. 


The other effect of the electric field is to shift the average “momentum” 
of the electrons in the Landau bands from zero to $\alpha$. We shall now see
that this shift has the effect of producing the classical Hall current — 
with a quantum twist... 


5.2.4 The Hall current 


Now let us calculate the current that flows in the x-direction in response
to the electric field in the $-y$ direction. The electric current carried by a
quantum gas of electrons is its probability current multiplied by the electric
charge $q = -e$. As you show on a problem set, the expression for the
probability current density is

\[
\vec j = \frac{\hbar}{m}\,\mathrm{Im}\left(\psi^*\vec\nabla\psi\right) - \frac{q}{mc}\,\vec A\,\psi^*\psi. \tag{85}
\]




In the gauge we are using, $A_y = 0$. Given this, and given the fact that the
harmonic oscillator wavefunctions, $H_n$, are real, we conclude that there is no
current in the y ($\hat e_2$) direction. This is remarkable since $\hat e_2$ is the direction
of the external electric field! The x-dependence of $\psi(x,y)$ is complex, so
a current in the $\hat e_1$ direction may exist. Substituting (82), multiplying by
$-e$, we obtain a current density in the $\hat e_1$ direction associated with the state
labeled by n and k. This current density depends on the details of the
harmonic oscillator wavefunctions. If we integrate over y from $-L/2$ to $L/2$
we obtain a simple expression for the current in the $\hat e_1$ direction from an
electron with “momentum” k in the $n^{\text{th}}$ Landau level,

\[
I_H(n,k) = -\frac{e\hbar k}{m\ell_0 W}, \tag{86}
\]

where the subscript H on I reminds us that this is the Hall current. Note
that the contribution of the $\vec A\,\psi^*\psi$ term in the current density (85) is odd in
y and so vanishes once we integrate over y.

Now let us suppose that all states in a given Landau band are filled and
calculate the associated current. This means that we sum (86) over the
allowed range of k. For $\alpha = 0$ the negative and positive values of k cancel
and we get $I_H = 0$. For $\alpha \neq 0$ we replace the sum over p by an
integral,

\[
I_H(n) = -\frac{e\hbar}{m\ell_0 W}\int_{p_-}^{p_+} k(p)\,dp, \tag{87}
\]

where $p_\pm$ are the limits on p given in eq. (81). This yields,

\[
I_H(n) = -\frac{e^2}{h}\,E_0 L, \tag{88}
\]

corresponding to a Hall conductance of $\sigma_H = -e^2/h$, independent of the size of
the sample (L and W) and of n, the label of the Landau band. If N Landau
bands are filled then the conductance is $-Ne^2/h$. In the matrix notation
of the previous section, we have found that the idealized quantum problem
leads to a purely off-diagonal conductance matrix

\[
\sigma = \frac{Ne^2}{h}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{89}
\]


The resistance matrix is the inverse of $\sigma$.
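As a sanity check on the step from (86) to (88), one can sum the single-state currents over the discrete allowed values of p instead of integrating. The sketch below does this in units where $\ell_0 = 1$ by default and returns the total current divided by $e^2E_0L/h$; it uses $eE_0 = \alpha\hbar\omega_L/\ell_0$ and $\hbar/m\ell_0^2 = \omega_L$ to eliminate the dimensional constants. The sample sizes and the value of $\alpha$ are hypothetical.

```python
import math


def hall_current_ratio(length, width, alpha, l0=1.0):
    """Sum eq. (86) over the allowed k of one filled Landau band and
    return I_H / (e^2 E_0 L / h). The continuum result (88) corresponds to -1."""
    half = length * width / (4 * math.pi * l0 ** 2)
    shift = alpha * width / (2 * math.pi * l0)
    p_values = range(math.ceil(-half + shift), math.floor(half + shift) + 1)
    sum_k = sum(2 * math.pi * l0 * p / width for p in p_values)
    # I_H = -(e hbar / (m l0 W)) * sum_k; dividing by e^2 E_0 L / h and using
    # e E_0 = alpha hbar omega_L / l0, hbar / (m l0^2) = omega_L gives:
    return -2 * math.pi * l0 ** 2 * sum_k / (width * alpha * length)


if __name__ == "__main__":
    for L, W in [(200.0, 300.0), (400.0, 150.0), (1000.0, 800.0)]:
        print(L, W, hall_current_ratio(L, W, alpha=0.5))
```

The ratio stays close to $-1$ for different sample shapes, with small deviations coming from rounding the endpoints of the p range to integers. This is the sense in which the conductance is independent of L and W.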




To summarize: If the idealized quantum treatment is justified, the conductance
is purely off-diagonal: an external field in the $\hat e_2$ direction leads to
a current in the $\hat e_1$ direction; and if N Landau bands are filled then the
off-diagonal (Hall) conductance is quantized in multiples of $e^2/h$.


5.2.5 A description of the integer quantum Hall effect 


Figure 3: (Image by MIT OpenCourseWare, adapted from R. E. Prange and S. M. Girvin, The Quantum Hall Effect, Springer-Verlag, Berlin, 1987.)

Now it is necessary to translate the rather abstract calculation we have just
completed into a description of the effect observed by von Klitzing and others.
I don’t have the time or expertise to do justice to the richness of the
phenomena in realistic conductors, but I will try to sketch the physics in a
somewhat idealized situation.

Figure 4 shows the results of a measurement of the Hall resistance (on
the left vertical axis) and the longitudinal resistance (on the right vertical
axis) as a function of $nhc/eB_0$, measured on an idealized sample at low
temperature. Two effects stand out: (1) the Hall conductance ($1/R_H$) is
quantized in integer multiples of $e^2/h$ over ranges of magnetic field strength.
The plateaus in $1/R_H$ are very flat and separated by steep increases as $1/R_H$
grows from one integer value to another. (2) The longitudinal resistance,
on the other hand, vanishes when $R_H$ is constant, and is finite when $R_H$ is
varying. The longitudinal resistance depends on the geometry of the sample
as it does for normal metals under normal conditions. The possible values of
the Hall resistance, on the other hand, are independent of the geometry of the




Figure 4: Longitudinal and transverse resistance for the quantum Hall effect.
(Figure by MIT OpenCourseWare, adapted from R. E. Prange and S. M. Girvin, The Quantum Hall Effect, Springer-Verlag, Berlin, 1987.)


sample. It depends on the sample only through the magnetic field strength,
$B_0$, and the charge density, n. Note that a smooth line interpolating through
the steps in $R_H$ gives the classical relationship $1/R_H = nec/B_0$ as expected
from the classical analysis at the beginning of this section (see eq. (69)).

To understand Fig. 4 it is necessary to figure out what is happening
to the electron spectrum as $nhc/eB_0$ is increased. Since electrons obey the
Pauli exclusion principle, only two electrons (one of each spin projection)
can be placed in each spatial state. For $B_0 = 0$ the states available to the
electrons form (essentially) a continuum. For $B_0 \neq 0$ the continuum breaks
up into Landau levels separated by gaps. Each Landau level can hold one
electron in an area corresponding to a single flux quantum. That area is
given by $hc/eB_0$. The number of electrons that can be accommodated in a




given Landau level grows as $B_0$ increases. When $B_0$ is extremely large, all
the electrons can be placed in the lowest Landau level. For large $B_0$, the
splitting between spin up and spin down electron levels also becomes large,
so we can ignore the higher spin state. The ratio of the number of electrons
per unit area, n, and the number per unit area per Landau level, $eB_0/hc$,
gives us the number of Landau levels that must be occupied as a function of
$B_0$ (at fixed n),

\[
N(B_0) = nhc/eB_0, \tag{90}
\]


which is exactly the independent variable in Fig. 4. 
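To get a feel for the numbers in eq. (90), the following sketch evaluates the filling in SI units, where $nhc/eB_0$ becomes $nh/eB_0$, together with the resistance quantum $h/e^2$ that sets the plateau values. The carrier density used in the example is a hypothetical value, not one taken from these notes.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C (exact in the 2019 SI)
PLANCK_H = 6.62607015e-34    # Planck constant, J s (exact in the 2019 SI)

# Resistance quantum h / e^2 in ohms; a plateau with N filled levels has
# Hall resistance h / (N e^2).
RESISTANCE_QUANTUM_OHMS = PLANCK_H / E_CHARGE ** 2


def filling(density_per_m2, field_tesla):
    """N(B_0) = n h / (e B_0), the SI form of eq. (90)."""
    return density_per_m2 * PLANCK_H / (E_CHARGE * field_tesla)


def field_at_filling(density_per_m2, levels):
    """Field B_0 at which exactly `levels` Landau levels are full."""
    return density_per_m2 * PLANCK_H / (E_CHARGE * levels)


if __name__ == "__main__":
    n = 3.0e15  # hypothetical areal density, electrons per m^2
    for levels in (1, 2, 3, 4):
        b = field_at_filling(n, levels)
        r_hall = RESISTANCE_QUANTUM_OHMS / levels
        print(f"N = {levels}: B_0 = {b:.2f} T, Hall resistance = {r_hall:.1f} ohm")
```

Lower fields correspond to more filled levels and hence to smaller plateau resistances, which is the ordering along the horizontal axis of Fig. 4.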

We know from the analysis of the previous section that the Hall conductance
is quantized when a Landau level is exactly full. Fig. 4 together with
our interpretation of the variable $nhc/eB_0$ seems to be saying that $1/R_H$
behaves as though the Landau level were exactly full for a range of $B_0$.

The key to understanding the integer quantum Hall effect is the role of 
impurities. We have treated the electrons as an ideal gas, free to orbit in Lan- 
dau levels. Real experiments are done using almost-two-dimensional electron 
systems called “inversion layers”, in which the electrons are essentially free to 
move in the xy-plane but are localized in z at a planar interface between one 
type of semi-conductor and another. In any real system, there are impurities. 
This means that while many electron energy levels are delocalized as in the 
ideal case — allowing them to move through the two-dimensional system and 
respond to external E and B fields, other energy levels remain pinned, i.e.,
localized, around impurities. Electrons that fill these localized levels do not 
conduct current. A picture of the energy levels in a realistic two-dimensional 
electron gas is shown in Fig. 5: localized states in between bands (in this case 
Landau levels) of delocalized states. The striking appearance of the integer 
quantum Hall effect can be understood by considering the sequential filling 
of localized and delocalized states as a function of $1/B_0$.

We begin with very large $B_0$, at the far left of Fig. 4. The lowest Landau
level is only partially filled. Electrons in a partially filled Landau level give
rise to a Hall conductance that is a fraction (the “filling fraction”) of the
quantum $e^2/h$. As $B_0$ decreases, the Landau level fills. The conductance is
exactly $e^2/h$. As $B_0$ decreases further electrons are forced into higher localized
levels between the first and second Landau levels. These electrons do not
conduct, so the Hall conductance stays fixed at $e^2/h$. When $B_0$ decreases
still further, the second Landau level quickly fills and the conductance rises
to $2e^2/h$. Succeeding intervals of filling localized and delocalized levels give




rise to the step-like pattern shown in Fig. 4. Eventually, at low magnetic 
field strength, the steps smooth out to follow the interpolating dashed line 
that marks the classical Hall resistance. 

The explanation of the behavior of the longitudinal resistance also de- 
pends crucially on the existence of impurities. In the last section we found 
that electrons in Landau levels do not contribute to the longitudinal conduc- 
tivity of a sample. This idealization breaks down in the presence of electron 
scattering from impurities. Such scattering generates a normal, longitudinal 
resistance when Landau levels are partially full. However, when a given Lan- 
dau level is exactly full, the electrons have no unoccupied levels into which 
to scatter and the longitudinal resistance vanishes. In the domains of $B_0$
where the localized states are filling the situation remains unchanged — the 
localized electrons do not participate. When a Landau level again begins to 
fill, scattering again generates a longitudinal resistance. 

We have only scratched the surface of the myriad phenomena associated
with the behavior of materials in the presence of magnetic fields.
Many aspects of these systems are studied by condensed matter theorists 
and experimenters in the MIT Physics Department. For further readings we 
recommend the book by Prange and Girvin. 




Figure 5: Localized and extended states in a “realistic” two-dimensional
electron gas, shown as the density of states N(E) with extended and localized states marked. (Image by MIT OpenCourseWare, adapted from R. E. Prange and S. M. Girvin, The Quantum Hall Effect, Springer-Verlag, Berlin, 1987.)


